Claude's willingness to poke outside of its present directory can definitely be a little worrying. Just the other day, it started trying to access my jails after I specifically told it not to.
On a Mac, I use built-in sandboxing to jail Claude (and every other agent) to $CWD so it doesn’t read/write anything it shouldn’t, doesn’t leak env, etc. This is done by dynamically generating access policies and I open sourced this at https://agent-safehouse.dev
By any chance, do you know what Claude Code's sandbox feature uses under the hood and how that relates to your solution ? From what I remember it also uses the native MacOS sandbox framework, but I haven't looked too deep into it and don't trust it fully
Claude Code sandboxing uses the same basic OS primitive but grants read access to the entire filesystem and includes escape hatches (some commands bypass sandboxing). Also, I wanted something solid I can use to limit every agent (OpenCode, Pi, Auggie, etc).
I haven’t found an easy way, but I have a working theory -
sandbox-exec cannot filter based on domain names, but it can restrict outbound network connections to a specific IP/port (and drop the rest). If I can run a proxy on localhost:19999, I can allow agents to connect through it and filter connections by hostname. From my research, most agents support $HTTP_PROXY, so I'll try redirecting their HTTP requests through my security proxy. IIRC, if I do this at the CONNECT level, I don't need to MITM their traffic nor require a trusted root cert.
Recently, Codex CLI implemented something like DNS filtering for their sandbox, so I'd investigate their repo.
Some commercial firewalls will snoop on the SNI header in TLS requests and send a RST towards the client if the hostname isn’t on a whitelist. Reasonably effective. If there’s a way with the macos sandboxing to intercept socket connections you might find some proxy software that already supports this.