Edit: In case it’s not clear, you should not use this.
benreesman 23 minutes ago [-]
The thing you want has a kind of academic jargon name (a coeffects algebra with graded/indexed monads for discharge) but is very intuitive, and it can do useful and complete attestation without compromising anyone's credentials (in the limit case, because everyone chooses which proxy to run).
I am OK with people publishing new ideas to the web as long as they are 100% honest and admit they just had an idea and asked an AI to build it for them. That way I can understand they may not have considered all the things that need to be considered, and I can skip it (and then prompt it myself if I want to, adding all the things I consider necessary)
Lucasoato 10 hours ago [-]
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Balinares 7 hours ago [-]
Ah, getting the job done by disabling important validation, if that isn't the most prominent Opus trait...
I wonder how much this will end up costing the industry in aggregate.
catlifeonmars 6 hours ago [-]
I’m thinking of pivoting into cybersecurity. I suspect that’s where all the money will be in the next couple of years.
philipwhiuk 4 hours ago [-]
At least until the pivot by Claude et al from AI for work to AI for cybersec analysis.
xmcqdpt2 7 hours ago [-]
Not entirely different from many human engineers...
philipwhiuk 4 hours ago [-]
Indeed - most of my StackOverflow credit is for explaining TLS config options.
arowthway 6 hours ago [-]
Don't use it if you plan to auto accept terminal commands, without a sandbox, while on a public wifi in a cafe, next to a hacker who decides to bet on you running a very niche configuration.
catlifeonmars 6 hours ago [-]
All you need is to manipulate DNS, inject a record with a long TTL and the rest is not required.
It scales very well and I guarantee this is not the only instance of misconfigured host verification. In other words, this is not as niche as you might think.
arowthway 5 hours ago [-]
If you're able to manipulate DNS, can't you just issue your own certificate for the domain? Even if it would be revoked moments later, mitmproxy doesn't check it even when ssl_insecure=false:
https://github.com/mitmproxy/mitmproxy/issues/2235
EDIT:
Maybe I incorrectly assumed you meant authoritative DNS.
catlifeonmars 5 hours ago [-]
You got it, authoritative not necessary. It just needs to be your router, your ISPs resolver, or the one at your public library/coffee shop/hotel etc. I’d throw BGP route poisoning in there too, but then you have much bigger problems lol.
Like you pointed out in your original post, this would be expensive to run as a targeted attack, but it has good unit economics if you scale it up, wait, and then harvest.
jmuncor 14 hours ago [-]
Just fixed it and implemented a simple HTTP relay, eliminating mitmproxy and the ssl_insecure=true. The new implementation uses TLS verification; doing last tests and merging it... After the merge, can you check it out and tell me if I earned your star? :D
catlifeonmars 13 hours ago [-]
I’m not sure you fully understand the implications of the misconfiguration of mitmproxy there. Effectively you provided an easily accessible front door for remote code execution on a user’s machine.
No offense, but I wouldn’t trust anything else you published.
I think it’s great that you are learning and it is difficult to put yourself out there and publish code, but what you originally wrote had serious implications and could have caused real harm to users.
jmuncor 13 hours ago [-]
Ohh my, no offense taken... Next time I will be a lot more careful with the stuff that I put out there. Learning and getting the hang of it; I would love it if you either comment on the code or mention here any other things you think could be improved. I am in the process of getting better and appreciate all the blunt and transparent feedback. No one grows out of praise.
badeeya 3 hours ago [-]
it's incredible that people pointed out very specifically what's wrong and you fell back to weaponized incompetence to shift the intellectual and mental burden of reviewing the code to outsiders instead of thinking for yourself. this is the problem with relying on LLMs: instead of thinking for yourself you just ask LLMs, and now other real people: "idk just fix it for me make it work". do you really not see the problem with this?
lionkor 5 hours ago [-]
I don't think you can get professionals to review code that you didn't even bother typing yourself.
You aren't learning much. You're vibe coding, which means you learn almost nothing, except maybe prompting your LLM better.
jurgenaut23 7 hours ago [-]
No, you’re in the process of vibe coding stuff you don’t understand and you will most likely never understand until you take the time to open a book.
ratg13 7 hours ago [-]
Your comment contains nothing but insults.
This is not a place for you to try and make yourself feel better by disparaging others.
jurgenaut23 5 hours ago [-]
You might find my comment insulting but saying that it contains insults is inaccurate.
Also, OP claims that he is here to learn, but he is mostly chasing cheap GH stars to boost his resume. How insulting is that?
throwaway277432 13 hours ago [-]
>tell me if I earned your star
Since you asked: Not in a million years, no.
A bug of this type is either an honest typo or a sign that the author(s) don't take security seriously. Even if it were a typo, any serious author would've put a large FIXME right there when adding that line disabling verification. I know I would. In any case a huge red flag for a mitm tool.
Seeing that it's vibe coded leads me to believe it's AI slop, not a simple typo from debugging.
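For anyone unsure what that one option changes: in stdlib terms (an illustrative sketch, not mitmproxy's actual code), `ssl_insecure=true` is roughly the difference between these two client contexts:

```python
import ssl

# What a client should do: verify the upstream certificate chain and
# check that the hostname matches (this is the stdlib default).
secure = ssl.create_default_context()
assert secure.verify_mode == ssl.CERT_REQUIRED
assert secure.check_hostname is True

# What ssl_insecure=true amounts to: accept any certificate from
# anyone, which lets an on-path attacker impersonate the API.
insecure = ssl.create_default_context()
insecure.check_hostname = False       # must be disabled first
insecure.verify_mode = ssl.CERT_NONE  # then verification itself is off
```

With the second context, a spoofed DNS answer plus a self-signed certificate is enough to read and rewrite the traffic.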
jmuncor 13 hours ago [-]
I love the real feedback tbh, I am still learning, and want to learn as much as possible. Would love if you can review it and tell me bluntly either in the repo or here the things that should be improved. I would love to learn more from you and get better :D
throwaway277432 13 hours ago [-]
I'm not going to review it in full, sorry. Reviewing is so much more effort compared to producing something with AI. But don't let me deter you, keep on learning and keep on building.
I wish I had the possibilities to learn and build on such a large scale when I started out. AI is a blessing and a curse I guess.
My own early projects were most definitely crap, and I made the exact same mistakes in the past. Honestly my first attempts were surely worse. But my projects were also tiny and incomplete, so I never published them.
However: What little parts I did publish as open-source or PRs were meticulously reviewed before ever hitting send, and I knew these inside and out and they were as good as I could make it.
Vibe-coded software is complete but never as good as you could make it, so the effort in reviewing it is mostly wasted.
I guess what I'm trying to say is I'm a bit tired of seeing student-level projects on HN / Github cosplaying as production ready software built by an experienced engineer. It used to be possible to distinguish these from the README or other cues, but nowadays they all look professional and are unintentionally polluting the software space when I'm actually looking for something.
Please understand that this is not specifically directed at you, it's pent up frustration from reading HN projects over the last months. Old guy yelling at clouds.
CurleighBraces 11 hours ago [-]
The README is really annoying.
You used to be able to tell so easily what was a good well looked after repo by viewing the effort and detail that had gone into the README.
Now it's too easy to slop up a README.
gr4vityWall 12 hours ago [-]
I appreciate that attitude. Keep it up.
jamespo 13 hours ago [-]
unlikely to get that from a throwaway
jmuncor 13 hours ago [-]
You can always try right?
antonvs 11 hours ago [-]
Only if you don’t care about your reputation.
“Give me your time for free” is not the kind of request that earns respect.
ewuhic 8 hours ago [-]
You don't understand what you're doing, and never will. Throw away all computing devices you've got.
monkaiju 6 hours ago [-]
And it's already surpassed my most starred project when it was on GitHub, all the more validating to have moved it to forgejo. If vibecoded stuff with unbelievable security vulns can get so much praise the whole star system doesn't work as a quality filter. Similarly a well crafted README used to help reflect quality, no longer...
ctippett 20 hours ago [-]
As someone who just set up mitmproxy to do something very similar, I wish this would've been a plugin/add-on instead of a standalone thing.
I know and trust mitmproxy. I'm warier and less likely to use a new, unknown tool that has such broad security/privacy implications. Especially these days with so many vibe-coded projects being released (no idea if that's the case here, but it's a concern I have nonetheless).
jmuncor 20 hours ago [-]
Agree! This was a fun project that I built because it is so hard to understand what "really" is in your context window... What do you mean by plugin/add-on? Add-on to what? Thinking of what to add to it next... Maybe security would be a good direction, or at least visibility into what is happening to the proxy's traffic.
Yes... Got it patched up and fixed some security issues it had. Let me know what you think of the new version.
jmuncor 18 hours ago [-]
What would you think of simply using an HTTP relay for all providers? Would that make you feel better security-wise? We could also extend the tool to change the context you are sending and make it more granular to what you want/need...
EMM_386 23 hours ago [-]
This is great.
When I work with AI on large, tricky code bases I try to do a collaboration where it hands off things to me that may result in a large number of tokens (excess tool calls, imprecise searches, verbose output, reading large files without a range specified, etc.).
This will help narrow down exactly which to still handle manually to best keep within token budgets.
Note: "yourusername" in install git clone instructions should be replaced.
winchester6788 17 hours ago [-]
I had a similar problem: when claude code (or codex) is running in a sandbox, I wanted to put a cap on large contexts, or get notified about them.
especially because once x0K words are crossed, the output gets worse.
https://github.com/quilrai/LLMWatcher
made this mac app for the same purpose. any thoughts would be appreciated
cedws 20 hours ago [-]
I've been trying to get token usage down by instructing Claude to stop being so verbose (saying what it's going to do beforehand, saying what it just did, spitting out pointless file trees) but it ignores my instructions. It could be that the model is just hard to steer away from doing that... or Anthropic want it to waste tokens so you burn through your usage quickly.
egberts1 19 hours ago [-]
Simply assert that:
you are a professional (insert concise occupation).
Be terse.
Skip the summary.
Give me the nitty-gritty details.
You can send all that using your AI client settings.
kej 22 hours ago [-]
Would you mind sharing more details about how you do this? What do you add to your AI prompts to make it hand those tasks off to you?
jmuncor 22 hours ago [-]
Hahahah just fixed it, thank you so much!!!! Thinking of extending this to a prompt admin; I'm sure there is a lot of trash that the system sends on every query, and I think we can improve this.
Roark66 3 hours ago [-]
I use litellm (slightly modified to allow Claude Code telemetry pass-through) and langfuse.
There is no need for MitM, you can set Api base address to your own proxy in all the coding assistants (at least all I know - Claude Code, opencode, gemini, vc plugin).
The changes I made allow use of the models endpoint in litellm at the same base url as telemetry and passing through Claude Max auth. This is not about using your Max with another cli tool, but about recording everything that happens.
There is a tool that can send CC json logs to langfuse but the results are much inferior. You lose parts of the tool call results, timing info, etc.
I'm quite happy with this. If anyone is interested I can post a github link.
david_shaw 23 hours ago [-]
Nice work! I'm sure the data gleaned here is illuminating for many users.
I'm surprised that there isn't a stronger demand for enterprise-wide tools like this. Yes, there are a few solutions, but when you contrast the new standard of "give everyone at the company agentic AI capabilities" with the prior paradigm of strong data governance (at least at larger orgs), it's a stark difference.
I think we're not far from the pendulum swinging back a bit. Not just because AI can't be used for everything, but because the governance on widespread AI use (without severely limiting what tools can actually do) is a difficult and ongoing problem.
LudwigNagasena 22 hours ago [-]
I had to vibe code a proxy to hide tokens from agents (https://github.com/vladimirkras/prxlocal) because I haven’t found any good solution either. I planned to add genai otel stuff that could be piped into some tool to view dialogues and tool calls and so on, but I haven’t found any good setup that doesn’t require lots of manual coding yet. It’s really weird that there are no solutions in that space.
dtkav 16 hours ago [-]
nice, I'm working on something similar with macaroons so the tokens can be arbitrarily scoped in time and capability too.
Mine uses an Envoy sidecar on a sandbox container.
https://github.com/dtkav/agent-creds
Yes, I was just thinking about how, as engineers, we're trained to document every thought that has ever crossed our minds, for liability and future reference. Yet once an LLM is done with its task, the "hit by a bus" scenario takes place immediately.
jmuncor 13 hours ago [-]
Yes, I think you can later store this in a database and start querying and optimizing what is happening there. You could even use these files, or a distillation of them, as long-term memory.
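A minimal sketch of that storage idea using stdlib sqlite3 (the schema and column names here are hypothetical, not anything the tool actually ships):

```python
import json
import sqlite3

# Hypothetical schema for querying captured LLM traffic later on.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE requests (
        ts     TEXT,
        model  TEXT,
        tokens INTEGER,
        body   TEXT   -- full JSON payload for later inspection
    )
""")

def record(ts, model, tokens, payload):
    """Append one captured request to the log table."""
    conn.execute(
        "INSERT INTO requests VALUES (?, ?, ?, ?)",
        (ts, model, tokens, json.dumps(payload)),
    )

record("2024-01-01T00:00:00Z", "claude", 1200, {"messages": ["hi"]})
record("2024-01-01T00:01:00Z", "claude", 3400, {"messages": ["hi", "more"]})

# Example query: total tokens per model.
total = conn.execute(
    "SELECT model, SUM(tokens) FROM requests GROUP BY model"
).fetchone()
print(total)  # ('claude', 4600)
```

Once the captures are rows instead of loose markdown files, "which sessions burned the most tokens" becomes a one-line query.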
Havoc 21 hours ago [-]
You don't need to mess with certificates: you can point CC at an HTTP endpoint and it'll happily play along.
If you build a DIY proxy you can also mess with the prompt on the wire. Cut out portions of the system prompt etc. Or redirect it to a different endpoint based on specific conditions etc.
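A sketch of the kind of on-the-wire rewrite a DIY proxy could do. The payload shape here assumes an Anthropic-style top-level "system" string; real payloads can also use a list form, so treat this as illustrative only:

```python
import json

def rewrite_body(raw: bytes, drop_markers=("<env>",)) -> bytes:
    """Trim parts of the system prompt before forwarding upstream.

    Assumes a top-level "system" string; a real proxy would also have
    to handle the list form and non-JSON bodies.
    """
    payload = json.loads(raw)
    system = payload.get("system", "")
    # Cut any line containing one of the markers -- purely illustrative.
    kept = [ln for ln in system.splitlines()
            if not any(m in ln for m in drop_markers)]
    payload["system"] = "\n".join(kept)
    return json.dumps(payload).encode()

body = json.dumps({
    "system": "You are helpful.\n<env>cwd=/home/user</env>",
    "messages": [{"role": "user", "content": "hi"}],
}).encode()
print(json.loads(rewrite_body(body))["system"])  # You are helpful.
```

The same hook point works for redirecting to a different endpoint based on the request contents.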
jmuncor 21 hours ago [-]
Have you tried this with Gemini? or Codex?
Havoc 12 hours ago [-]
I personally switched to opencode. The prompt I wanted to mess with - search - I don’t need to intercept there so less need for a proxy
thehamkercat 20 hours ago [-]
Have tried with gemini-cli and claude-code both; it works. Honestly, it should work with most if not all CLI clients.
jmuncor 18 hours ago [-]
Working on this feature right now!! Thank you for the suggestion, will start the branch for it... While thinking of improving context window usage: now that an HTTP relay lets us intercept the context window, anything you think could be cool to implement?
jmuncor 17 hours ago [-]
Got it on the feature branch http-relay, let me know what you think!
Activate controlled folder access and filesystem monitoring to see what tries to change things every time you load and use an LLM.
Most LLM models are programmed to call home on first load.
Then the libs you load them with also log, and something is looking to send bytes (check with a firewall for details).
HugstonOne uses an Enforced Offline policy / Offline switch because of that.
Our Users are so happy lately :) and will realize it clearly in the future.
syntaxing 14 hours ago [-]
It’s actually really easy to use mitmproxy as a…proxy. You set it up as a SOCKS proxy (or whatever) and point your network or browser to the proxy. I did this recently when a python tool was too aggressive on crawling the web and the server would reject me. Forced my session to limit 5 requests per second and it worked rather than finding the exact file to change in the library. Just do the same to your browser and then turn on the capture mode and you’ll see the requests
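The throttling described above boils down to spacing calls out by a minimum interval. A dependency-free sketch (in a mitmproxy addon this logic would live in the request hook; the class name is mine):

```python
import time

class RateLimiter:
    """Allow at most `rate` calls per second by sleeping as needed."""

    def __init__(self, rate: float):
        self.min_interval = 1.0 / rate
        self.last = float("-inf")  # no call made yet

    def wait(self, now=None, sleep=time.sleep):
        now = time.monotonic() if now is None else now
        delay = self.last + self.min_interval - now
        if delay > 0:
            sleep(delay)
            now += delay
        self.last = now
        return now

# With rate=5, consecutive calls end up at least 0.2s apart.
# (Clock and sleep are injected here so the example is deterministic.)
rl = RateLimiter(5)
first = rl.wait(now=0.0, sleep=lambda d: None)   # passes immediately
second = rl.wait(now=0.0, sleep=lambda d: None)  # pushed to t=0.2
print(first, second)  # 0.0 0.2
```

Calling `wait()` with no arguments uses the real clock and really sleeps, which is all a crawl-throttling proxy needs.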
jmuncor 14 hours ago [-]
The idea is to simplify and store it... Thinking of changing it to an HTTP relay, what do you think?
vitorbaptistaa 17 hours ago [-]
That looks great! Any plans on allowing exports to OpenTelemetry apps like Arize Phoenix? I am looking for ways to connect my Claude Code using Max plan (no API) to it and the best I found was https://arize.com/blog/claude-code-observability-and-tracing..., but it seems kinda heavyweight.
cetra3 17 hours ago [-]
Yeah would love this for logfire
jmuncor 16 hours ago [-]
Something like sherlock start --otel-endpoint?
vitorbaptistaa 16 hours ago [-]
Yes. It can get a bit more complex as some otels require authentication. You can check Pydantic AI Gateway, Cloudflare AI Gateway or LiteLLM itself. They do similar things. One advantage of yours would be simplicity.
jmuncor 13 hours ago [-]
I love this idea... Going to look into it, thank you!
asyncadventure 15 hours ago [-]
This is incredibly useful for understanding the black box of LLM API calls. The real-time token tracking is game-changing for debugging why certain prompts are so expensive and optimizing context window usage. Having the markdown/JSON exports of every request makes it trivial to iterate on prompt engineering.
jmuncor 13 hours ago [-]
That is exactly the idea, later we can actually tap into the middle and optimize how the context is actually being used. Feels like the current anthropic tools like compact don't do a great job at it.
teodorasgenova 9 hours ago [-]
Curious - what pushed you toward a proxy vs adding observability/instrumentation in the code?
daxfohl 18 hours ago [-]
Pretty slick. I've been wanting something like this where the capture is stored with a hash recorded in the corresponding code change's commit message. It'd be good for postmortems of unnoticed hallucinations, and might even be useful to "revive" the agent and see if it can help debug the problem it created.
https://github.com/quilrai/LLMWatcher
here is my take on the same thing, but as a mac app and using BASE_URL for intercepting codex, claude code and hooks for cursor.
maxkfranz 16 hours ago [-]
Could you use an approach like this much like a traditional network proxy, to block or sanitise some requests?
E.g. if a request contains confidential information (whatever you define that to be), then block it?
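A sketch of that kind of outbound filter; the patterns here are illustrative placeholders, and a real deployment would need org-specific rules:

```python
import re

# Hypothetical rules; replace with whatever "confidential" means to you.
CONFIDENTIAL = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                   # AWS access key id shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"), # PEM private keys
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),              # SSN-like pattern
]

def should_block(body: str) -> bool:
    """Return True if the outbound request body matches any rule."""
    return any(p.search(body) for p in CONFIDENTIAL)

print(should_block("summarize this meeting"))          # False
print(should_block("key=AKIAABCDEFGHIJKLMNOP creds"))  # True
```

In a proxy you would run this on the decoded request body and return an error response instead of forwarding when it trips.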
shepherdjerred 15 hours ago [-]
I do kinda the opposite, where I run my AI in a sandbox. It sends dummy tokens to APIs; the proxy then injects the real creds. So the AI never has access to creds.
https://clauderon.com/ -- not really ready for others to use it though
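The placeholder-swap can be sketched as a pure function applied at the proxy boundary; the `x-api-key` header name follows the Anthropic convention, but everything else here is illustrative:

```python
PLACEHOLDER = "sk-dummy-sandbox-token"  # what the sandboxed agent sees

def inject_credentials(headers: dict, real_key: str) -> dict:
    """Swap the placeholder for the real API key at the proxy boundary.

    The agent never holds the real key; only the proxy process does.
    """
    out = dict(headers)  # don't mutate the caller's headers
    if out.get("x-api-key") == PLACEHOLDER:
        out["x-api-key"] = real_key
    return out

hdrs = {"x-api-key": PLACEHOLDER, "content-type": "application/json"}
print(inject_credentials(hdrs, "sk-real")["x-api-key"])  # sk-real
```

Even if the agent exfiltrates its own environment, all it can leak is the dummy token.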
Thank you, what I was thinking was more along the lines of optimizing how you use your context window, so that the LLM can actually access what it needs: like an incredibly powerful compact that runs in the background, with your file system working as long-term memory... Still thinking about how to make it work, so I am super open to ideas.
FEELmyAGI 22 hours ago [-]
Dang, how will Tailscale make any money on its latest vibe-coded feature [0] when others can vibe code it themselves? I guess your SaaS really is someone's weekend vibe prompt.
That's what LLMs enabled. Faster prototyping. Also lots of exposed servers and apps. It's never been more fun to be a cyber security researcher.
[0] https://news.ycombinator.com/item?id=46782091
jmuncor 21 hours ago [-]
I think it just has been more fun being into computers overall!
pixl97 21 hours ago [-]
It's interesting because if you're into computers it's more accessible than ever and there are more things you can mess with more cheaply than ever. I mean we have some real science fiction stuff going on. At the same time it's probably different for the newer generations. Computers were magical to me and a lot of that was because they were rare. Now they are everywhere, they are just a backdrop to everything else going on.
jmuncor 21 hours ago [-]
I agree. I remember when feed-forward NNs were the shit! And now LLMs are owning; I think this adoption pattern will start pulling a lot of innovation into other computer science fields. Networking, for example. And the ability to have that pair programmer next to you makes it so much more fun to build: where before you had to spend a whole day debugging something, Claude now just helps you out and gives you time to build. Feels like long roadtrips with cruise control and lane-keeping assist!
mrbluecoat 23 hours ago [-]
So is it just a wrapper around MitM Proxy?
guessmyname 22 hours ago [-]
> So is it just a wrapper around MitM Proxy?
Yes.
I created something similar months ago [*] but using Envoy Proxy [1], mkcert [2], my own Go (golang) server, and Little Snitch [3]. It works quite well. I was the first person to notice that Codex CLI now sends telemetry to ab.chatgpt.com and other curiosities like that, but I never bothered to open-source my implementation because I know that anyone genuinely interested could easily replicate it in an afternoon with their favourite Agent CLI.
[*] In reality, I created this something like 6 years ago, before LLMs were popular, originally as a way to inspect all outgoing HTTP(s) traffic from all the apps installed in my macOS system. Then, a few months ago, when I started using Codex CLI, I made some modifications to inspect Agent CLI calls too.
[1] https://www.envoyproxy.io/
[2] https://github.com/FiloSottile/mkcert
[3] https://www.obdev.at/products/littlesnitch/
tkp-415 22 hours ago [-]
Curious to see how you can get Gemini fully intercepted.
I've been intercepting its HTTP requests by running it inside a docker container with:
-e HTTP_PROXY=http://127.0.0.1:8080 -e HTTPS_PROXY=http://host.docker.internal:8080 -e NO_PROXY=localhost,127.0.0.1
It was working with mitmproxy for a very brief period, then the TLS handshake started failing and it kept requesting for re-authentication when proxied.
You can get the whole auth flow and initial conversation starters using Burp Suite and its certificate, but the Gemini chat responses fail in the CLI, which I understand is due to how Burp handles HTTP2 (you can see the valid responses inside Burp Suite).
jmuncor 22 hours ago [-]
Tried with gemini and it gave more headaches than anything else; would love it if you can help me add it to sherlock... I use claude and gemini, claude mainly for coding, so I wanted to set it up first. With gemini, I ran into the same problem that you did...
paulirish 21 hours ago [-]
Gemini CLI is open source. Don't need to intercept at the network when you can just add inspectGeminiApiRequest() in the source. (I suggest it because I've been maintaining a personal branch with exactly that :)
tkp-415 4 hours ago [-]
Ahh, that seems much simpler. Dump the request / response directly. Now I'm wondering if I can use Gemini to patch Gemini.
jmuncor 22 hours ago [-]
Kind of, yes... But with a nice CLI so that you don't have to set it up: just run "sherlock claude" and "sherlock start" in two terminals and everything that claude sends in that session will be stored. So no proxy setup or anything, just simple terminal commands. :)
the_arun 19 hours ago [-]
I understand this helps if we have our own LLM runtime. What if we use external services like ChatGPT / Gemini (LLM providers)? Shouldn't they provide this feature to all their clients out of the box?
jmuncor 17 hours ago [-]
This works with claude code and codex... So you can use it with either of those; you don't need a local LLM running... :)
lionkor 5 hours ago [-]
Say it with me:
If I wanted an AI written tool for this, I would have prompted an AI, not opened HN.
elphard 21 hours ago [-]
This is fantastic. Claude doesn't make it easy to inspect what it's sending - which would actually be really useful for refining the project-specific prompts.
jmuncor 21 hours ago [-]
Love that you like it!! Let me know any ideas to improve it... I was thinking in the direction of a file system and protocol for the md files, or dynamic context building. But would love to hear what you think.
jedberg 14 hours ago [-]
Amusingly, I had the same question and asked Claude Code to vibe code me something similar. :)
jmuncor 14 hours ago [-]
Now you can add on top of it :D and we can all create something great :D
jedberg 13 hours ago [-]
As is the case with most vibe coded software, it wasn't polished, didn't work very well, had lots of edge cases, and was pretty much bespoke to my one use case. :)
It answered the question "what the heck is this software sending to the LLM" but that was about all it was good for.
jmuncor 13 hours ago [-]
That was what I wanted to answer... hehe. What edge cases can you think of, and what polish do you think I can add?
alde 7 hours ago [-]
The amount of AI slop hitting the HN front page is getting out of hand.
Then you open the comments and there are obvious LLM bots commenting on it.
Wonder if this is the end of HN.
zahlman 15 hours ago [-]
Or we could just demand agents that offer this level of introspection?
bandrami 15 hours ago [-]
I certainly wouldn't trust self-reporting on this
jmuncor 13 hours ago [-]
Not only trust, but how you later optimize what is in the context to cater to how you use llms... There is a whole world to be explored inside that context window.
rgj 15 hours ago [-]
LiteLLM does this, and can do a lot more beyond that.
jmuncor 13 hours ago [-]
Sometimes simplicity is the best thing to have.
alickkk 22 hours ago [-]
Nice work! Do I need to update the Claude Code config after starting this proxy service?
jmuncor 22 hours ago [-]
Nope... You just run "sherlock claude" and that sets up the proxy for you, so you don't have to think about it... Just use claude normally; every prompt you send in that session will be stored in the files.
hunter-xue 14 hours ago [-]
Support for more vibe-coding tools would be better, or capturing any app's traffic would be even more awesome.
It shells out to mitmproxy with "--set", "ssl_insecure=true"
This took all of 5 minutes to find reading through main.py on my phone.
https://github.com/jmuncor/sherlock/blob/fb76605fabbda351828...
https://imgur.com/a/Ztyw5x5
> I built this
Instead of
> I prompted this
But it's in the README:
Prompt you to install it in your system trust store