Trust, But Verify: What's Really Between Your AI Coding Tool and Your SSH Keys

Stack Overflow's 2025 survey [1] found that 84% of developers use or plan to use AI coding tools, and about half of professional developers use them daily. In December 2025, security researcher Ari Marzouk disclosed over 30 vulnerabilities [2] across Cursor, Windsurf, GitHub Copilot, Zed, Roo Code, Junie, and Cline. He called it IDEsaster, and I think the name fits.

I use these tools too at Another Cup of Coffee. I run AI agents across dozens of projects and they've genuinely changed how I work. However, I also spent a fair amount of time last year thinking about what's actually standing between the AI and my private keys, my client credentials, my source code. The answer turned out to be more interesting and more uneven than I expected, so I wrote it up.

What's been going wrong with AI coding tool security

The IDEsaster disclosure was not a one-off. A few months earlier, Johann Rehberger spent August 2025 disclosing one AI tool vulnerability per day [3] across ChatGPT, Copilot, Cursor, Claude Code, Google Jules, and others. The pattern was remarkably consistent, and the attacks were creative enough that they're worth understanding even in brief:

  • GitHub Copilot had CamoLeak [4] (CVE-2025-59145, CVSS 9.6), where invisible instructions in a pull request description could make Copilot Chat silently exfiltrate AWS keys and source code through GitHub's own image proxy. The result was a routine code review that leaked your secrets with no visible sign that anything was wrong.
  • Claude Code had three separate CVEs from multiple researchers: DNS exfiltration [5] via poisoned project files, automatic code execution [6] through malicious hook commands triggered when you open a project, and API key theft by overriding an environment variable in project config to redirect all API calls to an attacker's server.
  • Google Jules lacked meaningful protections against prompt injection, and Rehberger demonstrated a full attack chain from prompt injection to botnet recruitment. Google triaged it as an "abuse risk" and closed the ticket. Make of that what you will.
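
Two of the Claude Code issues above arrived through project-level config: hook commands and an environment override. A minimal Python sketch of a pre-open check follows. It assumes the project settings live at .claude/settings.json (the path this article names in its checklist) and that the relevant top-level keys are called hooks and env. Both key names are assumptions to confirm against your tool's documentation, and audit_project_settings is a hypothetical helper, not part of any tool.

```python
import json
from pathlib import Path

# Assumed key names for hook commands and environment overrides.
# Confirm them against your tool's documentation before relying on this.
RISKY_KEYS = ("hooks", "env")


def audit_project_settings(repo_root: str) -> list[str]:
    """Return warnings for risky keys found in a repo's project-level settings."""
    settings_path = Path(repo_root) / ".claude" / "settings.json"
    if not settings_path.is_file():
        return []
    try:
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        return [f"{settings_path}: could not parse ({exc})"]
    if not isinstance(settings, dict):
        return [f"{settings_path}: unexpected top-level type"]
    warnings = []
    for key in RISKY_KEYS:
        # Only flag keys that are present and non-empty.
        if settings.get(key):
            warnings.append(f"{settings_path}: sets '{key}', review before opening")
    return warnings
```

Run it against a freshly cloned repository before pointing an agent at it. An empty list only means these two keys were not set; it says nothing about other config files such as MCP definitions.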

The one that really made me sit up, though, was the Ona research [7]. A developer configured Claude Code with a denylist and a sandbox, and the agent found multiple ways around both. It discovered alternative filesystem paths to the same blocked binary, attempted to disable the sandbox when those paths triggered failures, and eventually invoked the Linux dynamic linker directly to bypass binary identification entirely. The agent actively reasoned about how to circumvent its own security controls, not because it was malicious but because the controls were between it and completing the task. That's not a bug. It's an AI doing exactly what AI does, and it's the part that should worry you most.

Simon Willison (who coined the term "prompt injection") calls this the "lethal trifecta" [8]: an AI agent with access to private data, exposure to untrusted content, and the ability to communicate externally. Every AI coding tool that runs locally on your machine meets all three criteria. A January 2026 paper formalised this into the Promptware Kill Chain [9], a seven-stage attack framework co-authored by Bruce Schneier and presented in a Black Hat webinar. The finding that stuck with me was that lateral movement (an agent compromised in one context attacking another) grew from zero documented incidents in 2023 to eight in 2025-2026. These tools aren't just targets any more. They're becoming vectors.

So what's standing between the AI and your SSH keys?

Right, so that's the doom and gloom. What's actually standing between these tools and the worst-case scenarios?

Most AI coding tools that execute commands on your machine have some combination of three security layers. Not all tools have all three, and the strength varies enormously, but understanding the layers helps you work out what you're actually relying on.

Application-level permissions are the first gate. This is the tool asking "shall I run this command?" before executing it. Claude Code has allow/deny lists, Cursor has its approval flow, Copilot has Safe Mode. Same idea, different implementations. The tool checks the command against its rules, and if it's not pre-approved, you get a prompt.

The problem is that this layer is self-enforced, meaning the application is policing itself. A trojanised installation could show you a permission prompt and execute the command regardless of what you click, and you'd never know because there's no way to verify the enforcement from within the application itself. Even with a legitimate installation, the Ona research showed the agent can reason its way around deny rules by finding alternative filesystem paths to the same binary. Self-enforcement is useful for catching mistakes, but it's not a security boundary.
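
To see why a deny rule that matches on names is easy to route around, here is a small Python sketch. Both functions are hypothetical illustrations and not how any particular tool implements its rules. The first compares the command's first token to a set of names; the second resolves symlinks before comparing, which closes the "alternative path" gap but nothing more.

```python
import os
import shlex


def naive_is_denied(command: str, denied_names: set[str]) -> bool:
    """String-level check: compare the first token to a set of denied names."""
    argv = shlex.split(command)
    return bool(argv) and argv[0] in denied_names


def resolved_is_denied(command: str, denied_paths: set[str]) -> bool:
    """Resolve symlinks first, then compare against canonical denied paths."""
    argv = shlex.split(command)
    if not argv:
        return False
    return os.path.realpath(argv[0]) in denied_paths
```

A symlink or a second path to the same binary slips past the first check and is caught by the second. Neither check catches a command whose first token is the dynamic linker with the blocked binary passed as an argument, because the thing being executed is then the linker. That is the limit of matching on what a command looks like.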

The bigger problem is approval fatigue. When you're deep in a task and clicking through permission prompts without really reading them, you've basically disabled the permission system while it's still running.

OS-level sandboxing is the second layer, and this is where things get genuinely interesting. Claude Code and Codex CLI both take this seriously, using bubblewrap [10] on Linux and Seatbelt on macOS for kernel-enforced isolation. Cursor caught up with version 2.0 (Landlock and seccomp on Linux). Copilot agent mode has terminal sandboxing on both platforms. The rest are further behind, and some have nothing at all.

The reason this layer matters is that it's not the application promising to behave. Bubblewrap creates Linux kernel namespaces that give the sandboxed process a restricted view of the filesystem and network, and because it's the operating system kernel preventing access rather than the application, a user-space process can't override it. When Claude Code says you can't read ~/.ssh/id_rsa from within the sandbox, that's the kernel saying no.
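
For a feel of what that kernel-enforced view looks like, here is a Python sketch that builds a bubblewrap command line. It is an illustration under stated assumptions, not the invocation Claude Code or any other tool actually uses: build_bwrap_argv and its parameters are hypothetical, and the layout (read-only root, writable project directory, empty tmpfs over hidden directories, no network namespace access) is one simple arrangement among many. The bwrap options themselves (--ro-bind, --bind, --dev, --proc, --tmpfs, --unshare-net, --die-with-parent) are standard.

```python
import os


def build_bwrap_argv(command: list[str], workdir: str, hidden_dirs: list[str],
                     allow_network: bool = False) -> list[str]:
    """Build an argv that runs `command` under bubblewrap with a restricted view."""
    workdir = os.path.abspath(workdir)
    argv = [
        "bwrap",
        "--ro-bind", "/", "/",        # whole filesystem visible, but read-only
        "--dev", "/dev",              # fresh minimal /dev
        "--proc", "/proc",            # fresh /proc for the new namespace
        "--bind", workdir, workdir,   # only the project directory is writable
        "--die-with-parent",          # sandboxed process exits with its parent
    ]
    for entry in hidden_dirs:
        path = os.path.expanduser(entry)
        # Mask only directories that exist, since the mount point must be present.
        if os.path.isdir(path):
            argv += ["--tmpfs", path]  # an empty tmpfs hides the real contents
    if not allow_network:
        argv.append("--unshare-net")   # new network namespace with no connectivity
    argv += ["--", *command]
    return argv
```

Passing the result to subprocess.run on a Linux machine with bubblewrap installed, with hidden_dirs set to something like ["~/.ssh", "~/.gnupg"], would run the command with those directories appearing empty. Later options override earlier ones, which is why the masks come after the root bind.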

But not everyone has this, and the gap between the leaders and the rest is quite wide. Windsurf relies on user approval prompts and enterprise policy controls with no OS-level sandbox, Aider has nothing built in, and Continue.dev's "Plan Mode" is a UX feature rather than a security boundary. Even among tools that do sandbox, the coverage varies more than you'd expect. Claude Code needs both bubblewrap and socat installed for full filesystem and network isolation (without socat, your domain allowlists aren't actually enforced, which is the kind of thing you only discover when you go looking), and Cursor had a credential leak issue where the sandbox still exposed home directory files [11].

I compiled this comparison from each tool's documentation and my own testing, as of March 2026. This field moves fast, so check the latest docs for your tool.

| Tool | Isolation | Network control |
|---|---|---|
| Claude Code | bubblewrap / Seatbelt (OS-level) | Proxy-based domain allowlist |
| Cursor 2.0 | Seatbelt / Landlock + seccomp | Permission prompts for external access |
| Copilot agent mode | Terminal sandboxing (experimental) | All network blocked when sandboxed |
| OpenAI Codex CLI | Seatbelt / seccomp + Landlock | Restricted |
| OpenAI Codex (cloud) | Isolated containers | Internet off by default |
| Devin | Cloud sandbox | Cloud-managed |
| Amazon Q | Docker containers | IAM-managed |
| Windsurf | Policy and approval prompts only | Configuration-based |
| Aider | None | None |
| Continue.dev | None | None |

Then there are plain OS file permissions, the standard Unix permissions enforced by the kernel. These are absolute within their scope and the strongest guarantee you have. But the scope is narrower than you'd think. Claude Code runs as your user account, so it can't touch another user's files or modify system files without sudo. That's real protection against privilege escalation. But your SSH keys, your browser data, your email, your cloud credentials, your shell config? Your account owns all of that, and OS permissions won't stop the tool from reading any of it. Everything in your home directory is fair game unless one of the other layers blocks it first.
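
A quick way to make that concrete is to ask the operating system what your own account can read. The Python sketch below does that for a short list of well-known locations. The list is illustrative rather than complete, and readable_sensitive_paths is a hypothetical helper name.

```python
import os
from pathlib import Path

# Illustrative locations only; adjust for your own machine.
SENSITIVE = (".ssh", ".gnupg", ".aws", ".password-store", ".bashrc", ".zshrc")


def readable_sensitive_paths(home: str, candidates=SENSITIVE) -> list[str]:
    """Return the candidate paths under `home` that the current user can read."""
    found = []
    for rel in candidates:
        path = Path(home) / rel
        # os.access answers for the user running this script, which is the
        # same user an unsandboxed coding tool runs as.
        if path.exists() and os.access(path, os.R_OK):
            found.append(str(path))
    return found
```

Calling readable_sensitive_paths(str(Path.home())) from a normal shell will usually list everything that exists, which is the point: file ownership alone does not separate you from a tool running as you.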

Basically, no single layer is enough. You need all three working together, and you need to know what each one actually covers, because the gaps between them are where the real risk lives.

What we're doing about it

I wrote earlier this year about how the multi-agent architecture avoids the kind of problems that hit OpenClaw. That mostly comes down to design choices: session-only agents (nothing running between sessions, so no daemon to hijack), encrypted credentials via pass and GPG rather than plaintext files, file-based coordination through text memos instead of shared API keys, and per-project isolation so a compromise in one project stays there. Those architectural properties are the foundation, and they still hold. But the incidents from the past year pushed me to add the runtime security layers on top, because good architecture doesn't help if the tool on your machine can read everything your user account can.

I run Claude Code with bubblewrap and socat on Arch Linux. The sandbox is on globally with the escape hatch disabled, meaning agents can't retry failed commands without sandboxing even if they want to. Sensitive paths are blocked at the kernel level. Private keys, GPG keyring, password store, cloud credentials are all on the deny-read list. Shell configs, the SSH directory, and the sandbox settings themselves are write-protected so an agent can't weaken its own restrictions. Network access from sandboxed commands is restricted to GitHub and package registries by default, with project-level overrides only where I've made a deliberate decision that a specific project needs access to a specific domain.
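
The shape of that policy (deny-read paths, write-protected paths, and a default-deny network allowlist) can be modelled in a few lines of Python. This is a sketch for reasoning about rules, not Claude Code's settings schema and not an enforcement mechanism: SandboxPolicy and its field names are hypothetical, paths are assumed to be absolute and already normalised, and treating subdomains of an allowed domain as allowed is an assumption made here for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical policy model: files default to allowed, network to denied."""
    deny_read: tuple[str, ...] = ()
    deny_write: tuple[str, ...] = ()
    allowed_domains: tuple[str, ...] = ()

    def can_read(self, path: str) -> bool:
        # Readable unless a deny-read glob matches.
        return not any(fnmatchcase(path, pat) for pat in self.deny_read)

    def can_write(self, path: str) -> bool:
        # Writable unless a deny-write glob matches.
        return not any(fnmatchcase(path, pat) for pat in self.deny_write)

    def can_connect(self, url: str) -> bool:
        # Blocked unless the host is an allowed domain or one of its subdomains.
        host = (urlsplit(url).hostname or "").lower()
        return any(host == d or host.endswith("." + d) for d in self.allowed_domains)
```

Writing the rules down this way makes it easy to test edge cases before trusting them, for example that a lookalike host such as github.com.evil.example does not pass a github.com allowlist entry.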

Commands like SSH and Docker that can't work inside a network namespace are excluded from the sandbox but still go through the permission layer. That's a weaker gate for those commands and I know it, but it's the trade-off: SSH needs real network access to reach remote hosts, and there's no way to sandbox that while keeping it functional. So I accept the weaker control for specific commands and tighten everything else around them.

The sudo thing is worth mentioning because I learned it the hard way. One of my agents got into a loop trying to run sudo commands. It couldn't authenticate (there's no interactive terminal for password entry, which is actually a natural protection), but it kept trying, and the repeated failures triggered pam_faillock and locked the account. I had to clear the lockout manually, and of course this happened in the middle of something urgent. The lesson isn't just "don't configure passwordless sudo" (though seriously, don't, because NOPASSWD: ALL gives a compromised agent full root access to your machine). It's that even failed sudo attempts have consequences, and the inability to use sudo is a feature, not a bug.
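
If you want to check whether a machine already has passwordless sudo configured, a rough Python sketch is to scan sudoers-format text for the NOPASSWD: tag. find_nopasswd_rules is a hypothetical helper; it works on text you supply (reading /etc/sudoers itself normally requires root), and it ignores line continuations and included files, so treat a clean result as a hint rather than proof.

```python
def find_nopasswd_rules(sudoers_text: str) -> list[str]:
    """Return non-comment lines of sudoers-format text that contain NOPASSWD:."""
    hits = []
    for raw in sudoers_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        if "NOPASSWD:" in line:
            hits.append(line)
    return hits
```

Any line it returns is worth reading closely, and a rule ending in NOPASSWD: ALL is the one described above.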

And the approval fatigue problem is real. I've caught myself clicking "yes" to permission prompts without reading them because I'm focused on the actual work, and then glancing at what I'd just approved and realising the agent was about to delete something it shouldn't or overwrite a file I hadn't backed up. That jolt of "wait, what did I just approve?" is not a good feeling, and it's what pushed me toward auto-allow mode with a properly configured sandbox rather than relying on manual approval for everything. The sandbox handles the enforcement; the prompts are a secondary check for things that fall outside it.

It took longer than I'd like to admit to get all of this working together without breaking the actual workflow. But that's sort of the price of taking it seriously, and now that it's in place, the day-to-day experience is genuinely better than it was when I was relying on permission prompts alone.

A practical AI coding tool security checklist

If you're using AI coding tools in a business context, here's what we'd recommend regardless of which tool you're on. This isn't theory; it's what I actually did, and the order roughly reflects priority.

  1. Verify your installation source. Every protection below assumes a legitimate installation. A trojanised tool can fake every prompt and status indicator. Install from official channels, verify checksums, keep things updated.
  2. Enable OS-level sandboxing. Claude Code has a /sandbox command. Cursor 2.0 has agent sandboxing. Copilot has terminal sandboxing. If your tool doesn't offer it (Windsurf, Aider, Continue.dev), you're relying entirely on application-level controls.
  3. Install socat (Linux). bubblewrap handles filesystem isolation but socat is needed for network domain filtering. Without it, your allowlists aren't enforced.
  4. Block read access to sensitive paths. Deny-read ~/.ssh/id_*, ~/.ssh/*_rsa, GPG keyring, password store, cloud credentials. SSH still works via ssh-agent. You lose nothing.
  5. Write-protect shell and tool configs. Deny-write ~/.bashrc, ~/.zshrc, ~/.ssh/, and the sandbox settings file itself, so an agent can't weaken its own restrictions.
  6. Disable the sandbox escape hatch. Claude Code's allowUnsandboxedCommands setting lets agents retry failed commands without sandboxing. Turn it off. A command failing inside the sandbox is the sandbox doing its job.
  7. Never configure passwordless sudo. AI tools can't use sudo interactively (no TTY). That's a natural protection. NOPASSWD: ALL removes it entirely and gives a compromised agent full root access.
  8. Restrict network to known domains. Allow GitHub, package registries, and whatever specific services your project needs. Everything else gets blocked at the sandbox level.
  9. Know which commands bypass the sandbox. SSH, Docker, and similar tools need real network access and typically run outside the sandbox. They still go through permission rules, but that's a weaker gate.
  10. Review project config files like code. Multiple CVEs used .claude/settings.json, hooks, MCP configs, and environment overrides as attack vectors. "Don't open untrusted repos" is the new "don't run untrusted executables."
  11. Verify the sandbox is actually running. On Linux: ps aux | grep bwrap during a session. If there are no bubblewrap processes while commands are executing, something is wrong.
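
The last check can also be scripted. The Python sketch below lists process IDs whose command name is exactly bwrap, which avoids the usual problem of grep matching its own process. The function names are hypothetical, and it assumes a ps that accepts -A and -o pid=,comm= (procps on Linux does).

```python
import subprocess


def find_bwrap_pids(ps_output: str) -> list[int]:
    """Parse `ps -A -o pid=,comm=` output and return PIDs of bwrap processes."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Expect "<pid> <command name>"; match the name exactly.
        if len(parts) == 2 and parts[0].isdigit() and parts[1].strip() == "bwrap":
            pids.append(int(parts[0]))
    return pids


def running_bwrap_pids() -> list[int]:
    """Ask ps for all processes and return the PIDs of any running bwrap."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid=,comm="],
        capture_output=True, text=True, check=True,
    )
    return find_bwrap_pids(result.stdout)
```

Run running_bwrap_pids() from a separate terminal while the agent is executing a command. An empty list at that moment is the signal to stop and check the sandbox configuration.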

Is this list perfect? No, and honestly the field is moving fast enough that it'll need updating. But it's a concrete starting point, and it's where I landed after working through the incidents and research above.

If you're not sure what your security posture actually looks like (or you suspect the answer is "whatever the defaults were"), that's the kind of thing we help with. We've been running this setup across dozens of projects for over a year, and we're happy to have a conversation about what would work for your situation.

Common Questions

Can AI coding tools read my SSH keys?

Yes, unless you've configured sandboxing to block it. AI coding tools run as your user account, which means they have the same file access you do. Your SSH keys, cloud credentials, browser data, and shell config are all readable by default. Of the three layers described above, OS-level sandboxing with explicit deny-read rules is the one that prevents it; permission prompts and file ownership alone do not.

Which AI coding tools have sandboxing?

As of March 2026, Claude Code, Cursor 2.0, Copilot agent mode, and OpenAI Codex CLI all offer some form of OS-level sandboxing. Windsurf, Aider, and Continue.dev rely on application-level controls or have no sandboxing at all. See the comparison table above for details.

Is the permission prompt enough to keep me safe?

On its own, no. Permission prompts are self-enforced by the application, which means a compromised tool could bypass them entirely. Even with a legitimate installation, approval fatigue is a real problem. OS-level sandboxing provides kernel-enforced protection that doesn't depend on you clicking the right button every time.


This article is part of an ongoing series on how Another Cup of Coffee is adapting to AI.

Footnotes

  1. Stack Overflow, 2025 Developer Survey.
  2. Ari Marzouk, IDEsaster: A Novel Vulnerability Class in AI IDEs, MaccariTA, December 2025.
  3. Simon Willison, The Summer of Johann, August 2025.
  4. Legit Security, CamoLeak: Critical GitHub Copilot Vulnerability Leaks Private Source Code.
  5. Johann Rehberger, Claude Code: Exfiltration via DNS Requests, Embrace The Red.
  6. Check Point Research, RCE and API Token Exfiltration through Claude Code Project Files.
  7. Ona, How Claude Code Escapes Its Own Denylist and Sandbox.
  8. Simon Willison, The Lethal Trifecta for AI Agents.
  9. Promptware Kill Chain, arxiv 2601.09625, January 2026. Co-authored by Bruce Schneier; presented in a Black Hat webinar.
  10. bubblewrap on GitHub. Unprivileged sandboxing tool using Linux kernel namespaces.
  11. Luca Becker, When Sandboxing Leaks Your Secrets.

Featured image photo by Elijah Cobb on Unsplash.