<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Focus on your mission, not your tech - Another Cup of Coffee</title><link>https://anothercoffee.net/</link><description>We handle the digital complexity for small businesses &amp; charities.</description><atom:link href="https://anothercoffee.net/rss.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Copyright © 2006 - 2026 &lt;a href="https://anothercoffee.net/" title="Another Cup of Coffee Limited"&gt;Another Cup of Coffee Limited&lt;/a&gt; </copyright><lastBuildDate>Fri, 27 Mar 2026 00:20:51 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>The Practical Guide to Locking Down Claude Code</title><link>https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/</link><dc:creator>Anthony Lopez-Vito</dc:creator><description>&lt;figure&gt;&lt;img src="https://anothercoffee.net/images/posts/locked-down-claude-code-og-1200x630.jpg"&gt;&lt;/figure&gt; &lt;p class="intro"&gt;We use Claude Code across dozens of projects at Another Cup of Coffee. It's genuinely changed how we work but these tools run as your user account with access to your entire home directory. Trusting them with full autonomy is a mistake. This guide covers the layered configuration I built to lock mine down.&lt;/p&gt;

&lt;p&gt;I wrote recently about &lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/"&gt;what's standing between AI coding tools and your SSH keys&lt;/a&gt;. That piece covered the threats, the security layers, and a checklist of things you should be doing. This is the practical follow-up where I walk through the security configuration I've built for Claude Code, and include examples that might be useful for your own setup. It's Claude Code-specific, but the principles apply to any AI tool that runs commands as your user account.&lt;/p&gt;
&lt;p&gt;You can read the &lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/"&gt;companion article&lt;/a&gt; for background on why these protections matter. If you just want to set them up, this guide will help.&lt;/p&gt;
&lt;div class="p-4 border bg-light mb-4"&gt;
&lt;p class="mb-0"&gt;&lt;i class="fa fa-exclamation-triangle fa-lg" aria-hidden="true" style="color: #e6a23c;"&gt;&lt;/i&gt; &lt;strong&gt;A word of caution.&lt;/strong&gt; This is experimental work. The security tooling for AI coding agents is still immature and the configurations below are what's working for me right now, not a finished product. Test everything in your own environment before relying on it. If you find issues or improvements, I'd genuinely like to &lt;a href="https://anothercoffee.net/contact/"&gt;hear about them&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;

&lt;h2 id="table-of-contents"&gt;Table of Contents&lt;/h2&gt;
&lt;div class="toc"&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#table-of-contents"&gt;Table of Contents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#key-terms-for-claude-code-security"&gt;Key terms for Claude Code security&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#the-os-layer-you-already-have"&gt;The OS layer you already have&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#what-youll-need"&gt;What you'll need&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#the-sandbox-path"&gt;The sandbox path&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#why-not-the-sandbox-for-ai-agent-workflows"&gt;Why not the sandbox for AI agent workflows?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#the-design-principle-layered-claude-code-security"&gt;The design principle: layered Claude Code security&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#layer-1-deny-permissions-in-settingsjson"&gt;Layer 1: deny permissions in settings.json&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#keep-credentials-out-of-context"&gt;Keep credentials out of context&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#the-sudo-convention"&gt;The sudo convention&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#layer-2-the-bash-guard-hook"&gt;Layer 2: the Bash guard hook&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#why-simple-text-matching-for-the-hook-script"&gt;Why simple text matching for the hook script?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#per-project-exemptions"&gt;Per-project exemptions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#registering-the-hook"&gt;Registering the hook&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#testing-your-security-setup"&gt;Testing your security setup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#whats-next-for-locking-down-claude-code"&gt;What's next for locking down Claude Code&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id="key-terms-for-claude-code-security"&gt;Key terms for Claude Code security&lt;/h2&gt;
&lt;p&gt;A few terms come up repeatedly in this guide, and they're worth defining upfront because some of them mean different things in different contexts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Permissions&lt;/strong&gt; (Claude Code level) are rules in &lt;code&gt;~/.claude/settings.json&lt;/code&gt; that control which tools the AI agent can use and which files it can access. These are enforced by the Claude Code application itself. They are not Unix file permissions, which are enforced by the operating system kernel. Both matter, and they protect different things at different layers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The sandbox&lt;/strong&gt; is Claude Code's built-in OS-level isolation. It relies on operating system utilities: bubblewrap on Linux and Seatbelt on macOS. When the sandbox is on, Bash commands run inside a restricted environment (kernel namespaces on Linux, a Seatbelt profile on macOS) with limited filesystem and network access. The sandbox wraps Bash commands only; it doesn't affect Claude Code's own Write or Edit tools.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hooks&lt;/strong&gt; are scripts that Claude Code runs automatically before or after certain actions. A PreToolUse hook runs before a tool call executes, and it can block the call by returning a deny decision. This guide uses a PreToolUse hook on the Bash tool to inspect commands before they run. Hooks are your code, running on your machine, triggered by Claude Code's event system.&lt;/p&gt;
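&lt;p&gt;To make the hook mechanism concrete, here is a minimal Python sketch of a PreToolUse decision. It assumes the hook receives the tool call as JSON on stdin with &lt;code&gt;tool_name&lt;/code&gt; and &lt;code&gt;tool_input.command&lt;/code&gt; fields, and that exiting with status 2 blocks the call; check both against the current Claude Code hooks documentation. The function names &lt;code&gt;decide&lt;/code&gt; and &lt;code&gt;run&lt;/code&gt; and the blocked substrings are hypothetical placeholders for illustration.&lt;/p&gt;

```python
import json
import sys

# Hypothetical examples of substrings to refuse; adjust to your own setup.
BLOCKED_SUBSTRINGS = ("rm -rf /", ".ssh/id_", ".claude/settings.json")


def decide(event):
    """Return (allowed, reason) for one hook event dictionary."""
    if event.get("tool_name") != "Bash":
        return True, "not a Bash call"
    tool_input = event.get("tool_input")
    command = tool_input.get("command") if isinstance(tool_input, dict) else None
    if not isinstance(command, str):
        # Fail closed: a Bash call that cannot be inspected is refused.
        return False, "no command string to inspect"
    for needle in BLOCKED_SUBSTRINGS:
        if needle in command:
            return False, "blocked substring: " + needle
    return True, "no blocked substring found"


def run(stream):
    """Read one event from stream and return the exit status for the hook."""
    try:
        allowed, reason = decide(json.load(stream))
    except Exception as exc:
        # Fail closed on malformed or unexpected input as well.
        allowed, reason = False, "hook error: " + str(exc)
    if not allowed:
        print(reason, file=sys.stderr)
        return 2  # assumed convention: exit status 2 blocks the tool call
    return 0
```

&lt;p&gt;To use something like this as a real hook script, the last line would be &lt;code&gt;sys.exit(run(sys.stdin))&lt;/code&gt;. The sketch fails closed: anything it cannot parse is treated as a deny.&lt;/p&gt;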
&lt;p&gt;&lt;strong&gt;OS-level permissions&lt;/strong&gt; are the standard Unix file permissions (owner, group, other) enforced by the Linux or macOS kernel. Claude Code runs as your user account, so it inherits whatever access your account has. This means it can read your SSH keys, your browser data, your shell config, and anything else your account owns, unless something else (sandbox, deny rules, hooks) blocks it first.&lt;/p&gt;
&lt;h2 id="the-os-layer-you-already-have"&gt;The OS layer you already have&lt;/h2&gt;
&lt;p&gt;It's important to understand what the operating system already gives you before configuring anything in Claude Code. Claude Code runs as your user account (whichever account you use to run it), which means standard Unix permissions are the first line of defence.&lt;/p&gt;
&lt;p&gt;I run Claude Code under a dedicated user account, separate from my day-to-day login. This gives real kernel-enforced isolation: the agent can't read my personal documents, SSH keys, or application configurations because those files belong to a different user (provided that user's home directory isn't readable by other accounts). Unix permissions won't let it cross accounts, and it's the strongest single thing you can do. The rest of this guide applies whether you do this or not. If you run Claude Code as your own user, the deny rules and hooks described below do more of the heavy lifting.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What this protects:&lt;/strong&gt; System files owned by root are already protected from modification by your user account. Your agent can't write to &lt;code&gt;/etc/passwd&lt;/code&gt; or replace &lt;code&gt;/usr/bin/ssh&lt;/code&gt; because your account doesn't have write permission to those locations. Some system files are also protected from reading: &lt;code&gt;/etc/shadow&lt;/code&gt; on Linux (which stores password hashes) is typically mode 000 or 640, so a non-root process can't read it at all. But most system files (like &lt;code&gt;/etc/passwd&lt;/code&gt;, everything in &lt;code&gt;/usr/bin/&lt;/code&gt;) are world-readable, just not world-writable. This is the kernel enforcing access control, and no amount of prompt injection or agent reasoning can bypass it. (macOS doesn't use &lt;code&gt;/etc/shadow&lt;/code&gt;. Instead it stores credentials in a separate system database that's similarly protected.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What this doesn't protect:&lt;/strong&gt; Everything your user account owns. That includes &lt;code&gt;~/.ssh/&lt;/code&gt; (your private keys), &lt;code&gt;~/.gnupg/&lt;/code&gt; (your GPG keyring), &lt;code&gt;~/.aws/credentials&lt;/code&gt;, &lt;code&gt;~/.bashrc&lt;/code&gt;, your browser profile, your email, and everything in your home directory. From the kernel's perspective, Claude Code reading your SSH private key is identical to you reading it yourself, because it's running as you.&lt;/p&gt;
&lt;p&gt;If you're running Claude Code as your own user, you can tighten things on Linux by setting restrictive permissions on sensitive directories: &lt;code&gt;chmod 700 ~/.ssh ~/.gnupg ~/.password-store&lt;/code&gt; ensures only your user can access them. Claude Code still can (it's running as you, remember) but it limits exposure from other accounts on the machine.&lt;/p&gt;
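&lt;p&gt;If you'd like to check where you stand before running &lt;code&gt;chmod&lt;/code&gt;, a short script can report which of those directories are open to other accounts. This is a minimal sketch using Python's standard &lt;code&gt;os&lt;/code&gt; and &lt;code&gt;stat&lt;/code&gt; modules; the helper names &lt;code&gt;exposed_to_others&lt;/code&gt; and &lt;code&gt;audit&lt;/code&gt; are made up for illustration, and it looks only at classic mode bits, not ACLs.&lt;/p&gt;

```python
import os
import stat

# Directories named in the text; any that do not exist are skipped.
SENSITIVE_DIRS = ("~/.ssh", "~/.gnupg", "~/.password-store")


def exposed_to_others(path):
    """Return True if group or other users have any permission bit set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # The low six bits hold the group and other permissions (octal 077).
    return mode % 0o100 != 0


def audit(paths=SENSITIVE_DIRS):
    """Return the existing paths that are accessible beyond their owner."""
    findings = []
    for raw in paths:
        path = os.path.expanduser(raw)
        if os.path.exists(path) and exposed_to_others(path):
            findings.append(path)
    return findings


for path in audit():
    print("consider chmod 700:", path)
```

&lt;p&gt;An empty result means those directories are already owner-only. It says nothing about what Claude Code itself can read, because the agent runs as the owner.&lt;/p&gt;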
&lt;p&gt;On macOS, the same principles apply, with the addition of TCC (Transparency, Consent, and Control). macOS protects certain directories (Desktop, Documents, Downloads) behind a consent system. The first time a process tries to access one of these, macOS shows a prompt, but it's attributed to the terminal emulator (Terminal.app or iTerm2), not to the child process. Once you grant your terminal access to a protected folder (or grant it Full Disk Access), every process it spawns, including Claude Code and its scripts, inherits that access silently. If your terminal already has Full Disk Access, TCC won't provide any additional protection. If it doesn't, consider whether it really needs it, because granting Full Disk Access to your terminal is the same as granting it to every CLI tool you run.&lt;/p&gt;
&lt;p&gt;The rest of this guide builds on top of this OS layer. The Claude Code permissions, sandbox, and hooks are all additional controls that restrict what the agent can do within the access your user account already has.&lt;/p&gt;
&lt;h2 id="what-youll-need"&gt;What you'll need&lt;/h2&gt;
&lt;p&gt;This guide assumes you're running Claude Code on Linux or macOS. The deny permissions and hook registration work the same on both platforms. The hook script needs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Python 3&lt;/strong&gt; (the hook shells out to Python for JSON parsing). On Linux, this is almost certainly already installed. On macOS, install via Xcode Command Line Tools (&lt;code&gt;xcode-select --install&lt;/code&gt;) or Homebrew.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;bash&lt;/strong&gt; (the hook wrapper uses bash, not sh). It ships with both Linux and macOS.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;syslog access&lt;/strong&gt; for audit logging. On Linux, logs go to the system journal and you can read them with &lt;code&gt;journalctl&lt;/code&gt;. On macOS, the Python &lt;code&gt;syslog&lt;/code&gt; module writes to the unified log, which you can check with &lt;code&gt;log show --predicate 'process == "python3"' --last 5m&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you want to use the sandbox instead of, or alongside, the hook approach, you'll also need:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Linux:&lt;/strong&gt; bubblewrap&lt;sup&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#fn2"&gt;2&lt;/a&gt;&lt;/sup&gt; and socat. On Arch: &lt;code&gt;pacman -S bubblewrap socat&lt;/code&gt;. On Debian/Ubuntu: &lt;code&gt;apt install bubblewrap socat&lt;/code&gt;. Your kernel needs unprivileged user namespaces enabled, which most modern distros have by default.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;macOS:&lt;/strong&gt; Seatbelt is built into the OS. No additional packages needed. socat is not required on macOS because Claude Code uses Seatbelt's native network filtering.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="the-sandbox-path"&gt;The sandbox path&lt;/h2&gt;
&lt;p&gt;Before I get into the hook-based approach, I should be clear that if your workflow doesn't involve SSH, rsync, or deleting files outside your project directory, the sandbox is the better option. It's a single configuration block that's kernel-enforced, and it handles both filesystem and network isolation.&lt;/p&gt;
&lt;p&gt;This is an example sandbox configuration for &lt;code&gt;~/.claude/settings.json&lt;/code&gt;. Adjust the paths and domains to match your setup:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;"sandbox"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;"autoAllowBashIfSandboxed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;"allowUnsandboxedCommands"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;"filesystem"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;"allowWrite"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"~/Projects"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"//tmp"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;"denyWrite"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;"//etc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"//usr"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"//boot"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;"~/.bashrc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.zshrc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.bash_profile"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.profile"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;"~/.claude/settings.json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.msmtprc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.mbsyncrc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;"~/.gnupg"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.ssh"&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;"denyRead"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;"~/.ssh/id_*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.ssh/*_rsa"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;"~/.gnupg/private-keys-v1.d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;"~/.password-store"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.aws/credentials"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.kube/config"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;"~/.bash_history"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.zsh_history"&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;"network"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;"allowedDomains"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;"github.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"raw.githubusercontent.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;"objects.githubusercontent.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="s2"&gt;"registry.npmjs.org"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pypi.org"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"files.pythonhosted.org"&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Here are the key settings.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;enabled: true&lt;/code&gt; turns on bubblewrap/Seatbelt isolation. &lt;code&gt;autoAllowBashIfSandboxed: true&lt;/code&gt; means Bash commands inside the sandbox run without asking for approval. The sandbox is doing the enforcement, so the approval prompt is redundant. &lt;code&gt;allowUnsandboxedCommands: false&lt;/code&gt; disables the escape hatch that lets agents retry failed commands without sandboxing. That last one matters because a command failing inside the sandbox means the sandbox is doing its job. The agent shouldn't be able to turn it off.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;denyRead&lt;/code&gt; on SSH keys does not break SSH, provided your keys are loaded into ssh-agent. The agent process handles authentication through a Unix socket, so the &lt;code&gt;ssh&lt;/code&gt; client never has to read the key files directly. Your AI agent can still &lt;code&gt;ssh&lt;/code&gt; into things if the sandbox allows network access to that host. It just can't read the private key file itself.&lt;/p&gt;
&lt;p&gt;Add project-specific domains to &lt;code&gt;allowedDomains&lt;/code&gt; as you need them. For example, if a project needs to reach an API or a staging server over HTTPS, you should add that domain. The network allowlist only affects sandboxed Bash commands, and WebFetch and WebSearch are controlled separately through the permissions layer.&lt;/p&gt;
&lt;p&gt;You can still layer deny permissions and hooks on top of the sandbox because they don't conflict. In fact, the extra layers catch what the sandbox doesn't cover, like a malicious Write tool call, which the sandbox never sees because it only wraps Bash. Personally, I'd recommend the deny permissions from Layer 1 below even if you're using the sandbox.&lt;/p&gt;
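&lt;p&gt;Because a single wrong value quietly weakens the sandbox, it can be worth checking the configuration with a script. The following is a rough sketch, not an official validator: &lt;code&gt;audit_sandbox&lt;/code&gt; is a hypothetical helper that takes the parsed contents of &lt;code&gt;settings.json&lt;/code&gt; and checks only the keys shown in the example above.&lt;/p&gt;

```python
def audit_sandbox(settings):
    """Return warnings for a settings.json dictionary already parsed from JSON."""
    sandbox = settings.get("sandbox")
    if not isinstance(sandbox, dict) or sandbox.get("enabled") is not True:
        return ["sandbox is not enabled"]
    warnings = []
    # The escape hatch must be explicitly switched off.
    if sandbox.get("allowUnsandboxedCommands") is not False:
        warnings.append("allowUnsandboxedCommands is not false")
    filesystem = sandbox.get("filesystem") or {}
    # The agent should not be able to rewrite its own restrictions.
    if "~/.claude/settings.json" not in filesystem.get("denyWrite", []):
        warnings.append("~/.claude/settings.json is missing from denyWrite")
    if not filesystem.get("denyRead"):
        warnings.append("denyRead is empty")
    return warnings


# A trimmed-down version of the example configuration, as a Python dictionary.
example = {
    "sandbox": {
        "enabled": True,
        "allowUnsandboxedCommands": False,
        "filesystem": {
            "denyWrite": ["~/.claude/settings.json", "~/.ssh"],
            "denyRead": ["~/.ssh/id_*"],
        },
    }
}
print(audit_sandbox(example))
```

&lt;p&gt;In practice you would load the real file with &lt;code&gt;json.load&lt;/code&gt; and pass the result in. An empty list means none of these particular checks failed, not that the setup is secure.&lt;/p&gt;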
&lt;p&gt;&lt;strong&gt;The rest of this article is for when the sandbox doesn't fit.&lt;/strong&gt; If you need SSH, rsync, raw TCP, or file deletion outside your project directory, read on because the sandbox is going to make things unworkable for you.&lt;/p&gt;
&lt;h2 id="why-not-the-sandbox-for-ai-agent-workflows"&gt;Why not the sandbox for AI agent workflows?&lt;/h2&gt;
&lt;p&gt;A sandbox is a good choice for a lot of workflows like editing code from someone's git repo. You should use it if it works for yours because it's a simpler setup than what I'm about to describe.&lt;/p&gt;
&lt;p&gt;A lot of my work involves &lt;a href="https://anothercoffee.net/run-dozens-of-projects-with-ai/"&gt;running AI agents that manage remote servers&lt;/a&gt;. This requires logging in via SSH, rsync-ing files between machines, and deleting temporary artifacts. The sandbox is too restrictive because it blocks everything that makes these agents useful. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;TCP networking is completely blocked, so SSH doesn't work;&lt;/li&gt;
&lt;li&gt;file deletion is blocked even on paths you've explicitly whitelisted, so files build up;&lt;/li&gt;
&lt;li&gt;there's no way to allow SSH to one host while blocking another.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I ended up having to replace the sandbox with two layers that offer the necessary protections.&lt;/p&gt;
&lt;h2 id="the-design-principle-layered-claude-code-security"&gt;The design principle: layered Claude Code security&lt;/h2&gt;
&lt;p&gt;The approach goes like this:&lt;/p&gt;
&lt;table class="table table-bordered mt-4 mb-4"&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Tool&lt;/th&gt;&lt;th&gt;Protection&lt;/th&gt;&lt;th&gt;Mechanism&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Write / Edit&lt;/td&gt;&lt;td&gt;Deny permissions in settings.json&lt;/td&gt;&lt;td&gt;Built-in, zero code (see caveat below)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Read&lt;/td&gt;&lt;td&gt;Allow-list scoped to ~/Projects/**&lt;/td&gt;&lt;td&gt;Built-in scoping&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Bash&lt;/td&gt;&lt;td&gt;PreToolUse hook (substring match)&lt;/td&gt;&lt;td&gt;Heuristic, fail-closed, honest about its limits&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;These are all Claude Code-level controls, sitting on top of the OS-level protections described earlier. The OS layer (Unix file permissions) protects system files from modification by your user account. The Claude Code layers protect everything your user account &lt;em&gt;can&lt;/em&gt; access but the agent &lt;em&gt;shouldn't&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The idea is that you don't use hooks where permissions work. Claude Code's deny permissions are enforced by the application itself. Hooks, on the other hand, are for Bash: a permission rule can match the tool name and a simple command pattern (like the &lt;code&gt;Bash(msmtp *)&lt;/code&gt; rules in Layer 1), but it can't inspect what's buried inside a longer command.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;The agent shouldn't be able to turn off the thing that's restricting it.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;h2 id="layer-1-deny-permissions-in-settingsjson"&gt;Layer 1: deny permissions in settings.json&lt;/h2&gt;
&lt;p&gt;Claude Code has a permission system&lt;sup&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#fn1"&gt;1&lt;/a&gt;&lt;/sup&gt; with allow, deny, and ask rules. The important thing about deny rules is that they're evaluated first, so no allow or ask rule can override them. If you deny something at the global level (&lt;code&gt;~/.claude/settings.json&lt;/code&gt;), no project-level configuration can re-enable it. This is enforced by Claude Code itself, not by the operating system.&lt;/p&gt;
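&lt;p&gt;The precedence can be pictured with a toy model. This is not Claude Code's actual matcher: it uses Python's &lt;code&gt;fnmatch&lt;/code&gt; as a stand-in for the real pattern syntax, and both the ordering of ask before allow and the fall-back to a prompt for unmatched calls are assumptions made for illustration. The one property it is meant to show is that a deny match wins regardless of what the allow list says.&lt;/p&gt;

```python
from fnmatch import fnmatchcase


def evaluate(call, deny=(), ask=(), allow=()):
    """Return 'deny', 'ask', or 'allow' for a call such as 'Bash(msmtp -t)'."""
    # Deny is checked first, so a later allow pattern can never rescue the call.
    for decision, patterns in (("deny", deny), ("ask", ask), ("allow", allow)):
        if any(fnmatchcase(call, pattern) for pattern in patterns):
            return decision
    return "ask"  # assumption: anything unmatched falls back to a prompt


global_deny = ["Bash(msmtp *)", "Write(//etc/**)"]
project_allow = ["Bash(*)", "Write(*)"]

print(evaluate("Bash(msmtp -t)", deny=global_deny, allow=project_allow))
print(evaluate("Bash(ls -la)", deny=global_deny, allow=project_allow))
```

&lt;p&gt;The first call is denied even though the project-level allow list matches every Bash command; the second passes because no deny pattern applies.&lt;/p&gt;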
&lt;p&gt;See below for an example deny block. Make sure you replace &lt;code&gt;/home/youruser&lt;/code&gt; with your actual home directory, and adjust the protected paths to match what's on your system:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;"permissions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;"deny"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(msmtp *)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(sendmail *)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(mail *)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//etc/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//usr/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//boot/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//sbin/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//lib/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//home/youruser/.bashrc)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//home/youruser/.zshrc)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//home/youruser/.profile)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//home/youruser/.bash_profile)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//home/youruser/.msmtprc)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//home/youruser/.mbsyncrc)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//home/youruser/.gnupg/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//home/youruser/.ssh/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Write(//home/youruser/.claude/settings.json)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//etc/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//usr/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//boot/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//sbin/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//lib/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//home/youruser/.bashrc)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//home/youruser/.zshrc)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//home/youruser/.profile)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//home/youruser/.bash_profile)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//home/youruser/.msmtprc)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//home/youruser/.mbsyncrc)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//home/youruser/.gnupg/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//home/youruser/.ssh/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Edit(//home/youruser/.claude/settings.json)"&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You'll notice every path appears twice, once under Write and once under Edit. This is important because they're separate tools, and an agent could use either to modify a file. Both need blocking.&lt;/p&gt;
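&lt;p&gt;Maintaining two copies of every path by hand invites drift, so one option is to generate the list. The sketch below assumes the double-slash absolute-path form used in the deny block above; &lt;code&gt;deny_rules&lt;/code&gt; and the &lt;code&gt;PROTECTED&lt;/code&gt; list are hypothetical names, and &lt;code&gt;/home/youruser&lt;/code&gt; is the same placeholder as before.&lt;/p&gt;

```python
import json

# A few of the paths from the deny block above, written as normal absolute paths.
PROTECTED = [
    "/etc/**",
    "/usr/**",
    "/home/youruser/.bashrc",
    "/home/youruser/.ssh/**",
    "/home/youruser/.claude/settings.json",
]


def deny_rules(paths, tools=("Write", "Edit")):
    """Build one deny rule per tool for every path, e.g. Write(//etc/**)."""
    rules = []
    for tool in tools:
        for path in paths:
            if not path.startswith("/"):
                raise ValueError("expected an absolute path: " + path)
            # Prepending one slash yields the double-slash form used above.
            rules.append(tool + "(/" + path + ")")
    return rules


print(json.dumps({"permissions": {"deny": deny_rules(PROTECTED)}}, indent=2))
```

&lt;p&gt;The output is a fragment to merge into &lt;code&gt;settings.json&lt;/code&gt; by hand, alongside any Bash rules you keep there. Generating it means a path added for Write can't be forgotten for Edit.&lt;/p&gt;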
&lt;p&gt;What's protected, and why:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;System directories&lt;/strong&gt; (&lt;code&gt;/etc&lt;/code&gt;, &lt;code&gt;/usr&lt;/code&gt;, &lt;code&gt;/boot&lt;/code&gt;, &lt;code&gt;/sbin&lt;/code&gt;, &lt;code&gt;/lib&lt;/code&gt;) cover OS configuration, installed packages, bootloader, and system binaries. An agent shouldn't write to any of these, unless it's one specifically deployed for administering the environment.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Shell configs&lt;/strong&gt; (&lt;code&gt;.bashrc&lt;/code&gt;, &lt;code&gt;.zshrc&lt;/code&gt;, &lt;code&gt;.profile&lt;/code&gt;, &lt;code&gt;.bash_profile&lt;/code&gt;) are protected because anything malicious an agent writes here runs every time you open a shell, long after the session ends.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SSH and GPG directories&lt;/strong&gt; (&lt;code&gt;.ssh&lt;/code&gt;, &lt;code&gt;.gnupg&lt;/code&gt;) hold your keys and trust chain. SSH still works fine for the agent via ssh-agent, which doesn't need direct file access.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mail configs&lt;/strong&gt; (&lt;code&gt;.msmtprc&lt;/code&gt;, &lt;code&gt;.mbsyncrc&lt;/code&gt;) contain SMTP credentials and mail server access.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Claude Code's own settings&lt;/strong&gt; (&lt;code&gt;.claude/settings.json&lt;/code&gt;) are protected because if an agent can modify its own permission rules, a prompt injection could disable every other protection in this list.&lt;/li&gt;
&lt;/ul&gt;
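&lt;p&gt;Because every path has to be listed once per tool, the Write and Edit lists can drift apart over time. One way to avoid that is to generate the deny block from a single list. This is a minimal Python sketch under stated assumptions: &lt;code&gt;build_deny_rules&lt;/code&gt; and &lt;code&gt;PROTECTED_PATHS&lt;/code&gt; are hypothetical names introduced for illustration, the path list is shortened, and &lt;code&gt;/home/youruser&lt;/code&gt; is the same placeholder used in the rules above.&lt;/p&gt;

```python
import json

# Hypothetical helper: expands one list of paths into paired Write and Edit
# deny rules, so the two tools cannot drift out of sync.
HOME = "/home/youruser"  # placeholder, as in the examples above

# Shortened, illustrative list. Extend it with the paths you protect.
PROTECTED_PATHS = [
    "/etc/**",
    HOME + "/.bashrc",
    HOME + "/.ssh/**",
    HOME + "/.claude/settings.json",
]

def build_deny_rules(paths, tools=("Write", "Edit")):
    """Return rules like 'Write(//etc/**)' for every tool and path."""
    rules = []
    for tool in tools:
        for path in paths:
            # The rules in this guide prefix absolute paths with an extra slash.
            rules.append(f"{tool}(/{path})")
    return rules

if __name__ == "__main__":
    settings = {"permissions": {"deny": build_deny_rules(PROTECTED_PATHS)}}
    print(json.dumps(settings, indent=2))
```

&lt;p&gt;The printed JSON would still need merging into your existing &lt;code&gt;settings.json&lt;/code&gt; by hand, outside of any agent session, since that file is itself on the protected list.&lt;/p&gt;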
&lt;p&gt;&lt;strong&gt;A caveat on reliability.&lt;/strong&gt; In theory, deny permissions are the strongest layer but there have been bugs where deny rules for Read, Write, and Edit were silently ignored.&lt;sup&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#fn3"&gt;3&lt;/a&gt;&lt;/sup&gt; This is exactly why this guide layers hooks on top of permissions. The deny rules should work, but if they don't, the hook hopefully catches it. While you can't anticipate every problem, a defence in depth mindset protects you from most things.&lt;/p&gt;
&lt;p&gt;The allow list is the other side of this setup. You can scope Read access to a specific directory (like &lt;code&gt;~/Projects/**&lt;/code&gt;) so agents can't browse the rest of your home directory, and pre-approve common low-risk Bash commands (e.g. &lt;code&gt;ls&lt;/code&gt;, &lt;code&gt;git&lt;/code&gt;, &lt;code&gt;grep&lt;/code&gt;, &lt;code&gt;cat&lt;/code&gt;, &lt;code&gt;diff&lt;/code&gt;) to reduce approval fatigue. I say reduce because approval fatigue is still a thing, unfortunately. The practical trade-off is to auto-allow the safe stuff so you're not clicking "yes" a hundred times a day...maybe just 99. Experienced Claude users know what I mean.&lt;/p&gt;
&lt;p&gt;Here's an example allow block showing the pattern:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;"permissions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;"allow"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Read(//home/youruser/Projects/**)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(ls:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(git -C /:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(grep:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(cat:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(diff:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(cp:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(mkdir:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"WebSearch"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="s2"&gt;"WebFetch"&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You'll notice &lt;code&gt;cat&lt;/code&gt; and &lt;code&gt;cp&lt;/code&gt; are auto-approved here even though they could technically read or copy sensitive files.&lt;sup&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#known-limitations"&gt;*&lt;/a&gt;&lt;/sup&gt; This is where the layers work together: the hook script in Layer 2 catches any &lt;code&gt;cat&lt;/code&gt; or &lt;code&gt;cp&lt;/code&gt; command that references a protected path like &lt;code&gt;~/.ssh/&lt;/code&gt;, even if the permission system has already auto-approved it. The allow rule lets the command skip the approval prompt, but the hook still inspects it.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;Bash(git -C /:*)&lt;/code&gt; pattern is worth explaining because I came across commit approval problems early on when using git with my agents. The &lt;code&gt;-C&lt;/code&gt; flag tells git to operate on a specific directory, so &lt;code&gt;git -C /home/youruser/Projects/myproject status&lt;/code&gt; works without the agent needing to &lt;code&gt;cd&lt;/code&gt; anywhere. Pre-approving this pattern means git commands run without prompting the user, but only when they include an explicit absolute path. &lt;code&gt;Bash(git:*)&lt;/code&gt; without the &lt;code&gt;-C&lt;/code&gt; would also match bare &lt;code&gt;git&lt;/code&gt; commands, which run in whatever directory the agent happens to be in. That means git could act on a repo you didn't intend, so the &lt;code&gt;-C&lt;/code&gt; pattern forces the agent to be explicit.&lt;/p&gt;
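&lt;p&gt;To make the difference between the two patterns concrete, here is a simplified Python model of prefix rules. It assumes a rule of the form &lt;code&gt;Bash(PREFIX:*)&lt;/code&gt; matches any command that starts with &lt;code&gt;PREFIX&lt;/code&gt;. The function &lt;code&gt;matches_prefix_rule&lt;/code&gt; is a hypothetical name, and Claude Code's real matcher may handle details such as shell operators differently, so treat this as an illustration rather than a reimplementation.&lt;/p&gt;

```python
def matches_prefix_rule(rule, command):
    """Simplified model: 'Bash(PREFIX:*)' matches commands starting with PREFIX."""
    if not (rule.startswith("Bash(") and rule.endswith(":*)")):
        return False
    # Strip the 'Bash(' wrapper and the ':*)' suffix to get the literal prefix.
    prefix = rule[len("Bash("):-len(":*)")]
    return command.startswith(prefix)

RULE = "Bash(git -C /:*)"
for cmd in [
    "git -C /home/youruser/Projects/myproject status",  # explicit absolute path
    "git status",                                        # bare git command
    "git -C relative/dir status",                        # relative path
]:
    print(matches_prefix_rule(RULE, cmd), cmd)
```

&lt;p&gt;Under this model only the first command is pre-approved. The other two would fall back to the normal approval prompt.&lt;/p&gt;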
&lt;h3 id="keep-credentials-out-of-context"&gt;Keep credentials out of context&lt;/h3&gt;
&lt;p&gt;There's also a broader principle we need to keep in mind. Credentials should almost never appear in your agent's conversation context because if a secret shows up in a tool output or a file the agent reads, it's in the context window. From there it could end up somewhere else, like a log or a commit message.&lt;/p&gt;
&lt;p&gt;The solution for SSH is straightforward. Just use ssh-agent. The agent runs &lt;code&gt;ssh myserver&lt;/code&gt; and ssh-agent handles authentication through a Unix socket. The private key never enters Claude Code's context. This makes your deny rules on &lt;code&gt;~/.ssh/&lt;/code&gt; the backup, while ssh-agent is the primary protection.&lt;/p&gt;
&lt;p&gt;For database credentials, you can store them server-side in standard locations. For example, MySQL/MariaDB reads from &lt;code&gt;~/.my.cnf&lt;/code&gt; on the server and PostgreSQL reads from &lt;code&gt;~/.pgpass&lt;/code&gt;. The agent runs &lt;code&gt;ssh myserver "mysql -e 'SHOW DATABASES'"&lt;/code&gt; and the credentials are loaded by the server rather than by the agent.&lt;/p&gt;
&lt;p&gt;For API keys and cloud credentials, it helps to have environment variables loaded from a protected file, like &lt;code&gt;pass&lt;/code&gt; with GPG, or a &lt;code&gt;.env&lt;/code&gt; file in a directory the agent can't read. The agent can then call the API through a wrapper script that sources the credentials at runtime.&lt;/p&gt;
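&lt;p&gt;A wrapper along those lines could look like the following Python sketch. It assumes a plain &lt;code&gt;KEY=VALUE&lt;/code&gt; file; the function names &lt;code&gt;load_env_file&lt;/code&gt; and &lt;code&gt;run_with_secrets&lt;/code&gt; are hypothetical. The idea is that the secrets go only into the child process's environment and are never printed by the wrapper itself.&lt;/p&gt;

```python
import os
import subprocess

def load_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blanks, # comments and junk."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop optional surrounding quotes around the value.
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values

def run_with_secrets(argv, env_path):
    """Run argv with the secrets added to the child's environment only."""
    child_env = dict(os.environ)
    child_env.update(load_env_file(env_path))
    # The wrapper never prints the secret values. Only the child's own
    # output is returned to the caller.
    result = subprocess.run(argv, env=child_env, capture_output=True, text=True)
    return result.returncode, result.stdout
```

&lt;p&gt;This only helps if the wrapped command doesn't echo the credential in its own output, and if the wrapper and the credentials file both sit somewhere the agent can't read or edit.&lt;/p&gt;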
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;If a secret shows up in a tool output or a file the agent reads, it's in the context window. From there it could end up somewhere else.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;In essence, wherever possible, you keep credentials somewhere outside of the agent's reach and instead, have the agent invoke a command that uses credentials indirectly. If you want to understand why this matters beyond convenience, the &lt;a href="https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/"&gt;OpenClaw incident&lt;/a&gt; is a good case study in what happens when credentials end up where they shouldn't.&lt;/p&gt;
&lt;h3 id="the-sudo-convention"&gt;The sudo convention&lt;/h3&gt;
&lt;p&gt;There's one more convention-level control I can cover. AI coding tools can't use &lt;code&gt;sudo&lt;/code&gt; interactively (unless you've configured passwordless sudo) because there's no terminal for password entry. That's a great natural protection, but an agent will indeed try, and keep trying. If you're on Linux, it will eventually trigger &lt;code&gt;pam_faillock&lt;/code&gt;, and the repeated authentication failures end up locking you out of sudo until the timeout expires. I learned this the hard way when an agent silently got into a retry loop, and eventually locked me out of my own machine at a very inconvenient time.&lt;/p&gt;
&lt;p&gt;The fix is an instruction in each agent's configuration file. Something like: "Never run &lt;code&gt;sudo&lt;/code&gt; via the Bash tool. Instead, give the user the command to run themselves." Most experienced AI agent users will know "Never run..." is never really never. But that's the best you can do for now.&lt;/p&gt;
&lt;p&gt;It's not so bad on macOS because repeated failed &lt;code&gt;sudo&lt;/code&gt; attempts won't lock the account unless you've configured it to do so. However, an agent will waste time retrying and end up flooding the security log.&lt;/p&gt;
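&lt;p&gt;Since an instruction alone is "never really never", the same rule could also be enforced by a small PreToolUse hook. The following Python sketch assumes the hook input and output format used by the guard script in Layer 2 (a &lt;code&gt;tool_input.command&lt;/code&gt; field in, a &lt;code&gt;permissionDecision&lt;/code&gt; out). The names &lt;code&gt;handle&lt;/code&gt;, &lt;code&gt;SUDO_WORD&lt;/code&gt; and &lt;code&gt;DENY_REASON&lt;/code&gt; are hypothetical, and the word match is deliberately blunt: it will also block harmless commands that merely mention sudo.&lt;/p&gt;

```python
import json
import re

# Matches "sudo" as a standalone word anywhere in the command text.
SUDO_WORD = re.compile(r"\bsudo\b")

DENY_REASON = (
    "Do not run sudo via the Bash tool. "
    "Give the user the command to run themselves."
)

def handle(raw_json):
    """Return the JSON text to print for one hook call ('' means no objection)."""
    try:
        command = str(json.loads(raw_json)["tool_input"]["command"])
    except Exception:
        # Unparseable input: this sketch stays silent and leaves the decision
        # to the other layers. Return a deny here to fail closed instead.
        return ""
    if not SUDO_WORD.search(command):
        return ""
    return json.dumps({
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": DENY_REASON,
        }
    })

# As a hook script, the last step would be:
#   import sys; sys.stdout.write(handle(sys.stdin.read()))
```

&lt;p&gt;Blocking the attempt before it runs means the failed authentication never happens, so it can't count towards a &lt;code&gt;pam_faillock&lt;/code&gt; lockout.&lt;/p&gt;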
&lt;h2 id="layer-2-the-bash-guard-hook"&gt;Layer 2: the Bash guard hook&lt;/h2&gt;
&lt;div class="p-4 border bg-light mb-4"&gt;
&lt;p class="mb-0"&gt;&lt;i class="fa fa-exclamation-triangle fa-lg" aria-hidden="true" style="color: #e6a23c;"&gt;&lt;/i&gt; &lt;strong&gt;This hook is a deterrent, not a security boundary.&lt;/strong&gt; It catches common commands that reference protected paths but it can be bypassed by an agent that constructs paths indirectly. It's one layer in a defence-in-depth setup, not a standalone protection.&lt;/p&gt;
&lt;/div&gt;

&lt;p&gt;Deny permissions can block &lt;code&gt;Write(//etc/**)&lt;/code&gt;, but they can't block &lt;code&gt;Bash(cat /etc/shadow)&lt;/code&gt;. The permission system checks which tool is being called and (for Write/Edit) which file path is targeted. It doesn't look for file paths inside a Bash command though, so that's why we need the hook script.&lt;/p&gt;
&lt;p&gt;Here's an example:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="ch"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="c1"&gt;# PreToolUse hook: Guard sensitive system paths and secrets from Bash commands.&lt;/span&gt;
&lt;span class="c1"&gt;# Write/Edit protection is handled by deny permissions in ~/.claude/settings.json.&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# Input: JSON on stdin with .tool_input.command field&lt;/span&gt;
&lt;span class="c1"&gt;# Output: JSON with "deny" decision, or empty output to allow&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# Approach: check if the command string contains any protected path.&lt;/span&gt;
&lt;span class="c1"&gt;# This is intentionally simple — a substring match is more reliable than&lt;/span&gt;
&lt;span class="c1"&gt;# trying to parse shell syntax. False positives are rare in practice&lt;/span&gt;
&lt;span class="c1"&gt;# (legitimate commands rarely reference system paths or secret files).&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# Limitations:&lt;/span&gt;
&lt;span class="c1"&gt;#   - Cannot detect paths constructed at runtime via variable expansion&lt;/span&gt;
&lt;span class="c1"&gt;#   - Cannot detect symlinks pointing to protected paths&lt;/span&gt;
&lt;span class="c1"&gt;#   - Will false-positive if a protected path appears as a string literal&lt;/span&gt;

&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-euo&lt;span class="w"&gt; &lt;/span&gt;pipefail&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# see known limitations&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;!&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;command&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-v&lt;span class="w"&gt; &lt;/span&gt;python3&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;&amp;amp;&lt;/span&gt;&amp;gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"deny","permissionDecisionReason":"python3 not found — blocking as precaution"}}'&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;exit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="nv"&gt;HOOK_INPUT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;head&lt;span class="w"&gt; &lt;/span&gt;-c&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1048576&lt;/span&gt;&lt;span class="k"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="si"&gt;${#&lt;/span&gt;&lt;span class="nv"&gt;HOOK_INPUT&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-ge&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1048576&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"deny","permissionDecisionReason":"Input too large — blocking as precaution"}}'&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nb"&gt;exit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;HOOK_INPUT

&lt;span class="nb"&gt;exec&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;python3&lt;span class="w"&gt; &lt;/span&gt;-&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;lt;&amp;lt;'PYTHON_EOF'&lt;/span&gt;
&lt;span class="s"&gt;import json, sys, os, re, syslog, signal&lt;/span&gt;

&lt;span class="s"&gt;signal.signal(signal.SIGALRM, lambda *_: deny("Command analysis timeout — blocking as precaution."))&lt;/span&gt;
&lt;span class="s"&gt;signal.alarm(5)&lt;/span&gt;

&lt;span class="s"&gt;HOME = os.path.expanduser("~")&lt;/span&gt;
&lt;span class="s"&gt;command = ""&lt;/span&gt;

&lt;span class="s"&gt;def deny(reason):&lt;/span&gt;
&lt;span class="s"&gt;    try:&lt;/span&gt;
&lt;span class="s"&gt;        syslog.syslog(syslog.LOG_WARNING, f"PATH_GUARD_BLOCK: {reason} | Command: {command[:200]}")&lt;/span&gt;
&lt;span class="s"&gt;    except Exception:&lt;/span&gt;
&lt;span class="s"&gt;        pass&lt;/span&gt;
&lt;span class="s"&gt;    print(json.dumps({&lt;/span&gt;
&lt;span class="s"&gt;        "hookSpecificOutput": {&lt;/span&gt;
&lt;span class="s"&gt;            "hookEventName": "PreToolUse",&lt;/span&gt;
&lt;span class="s"&gt;            "permissionDecision": "deny",&lt;/span&gt;
&lt;span class="s"&gt;            "permissionDecisionReason": reason,&lt;/span&gt;
&lt;span class="s"&gt;        }&lt;/span&gt;
&lt;span class="s"&gt;    }))&lt;/span&gt;
&lt;span class="s"&gt;    sys.exit(0)&lt;/span&gt;

&lt;span class="s"&gt;# Protected paths — if any of these appear in the command, block it.&lt;/span&gt;
&lt;span class="s"&gt;# This list is not exhaustive — see known limitations.&lt;/span&gt;
&lt;span class="s"&gt;PROTECTED = [&lt;/span&gt;
&lt;span class="s"&gt;    "/etc/",&lt;/span&gt;
&lt;span class="s"&gt;    "/usr/",&lt;/span&gt;
&lt;span class="s"&gt;    "/boot/",&lt;/span&gt;
&lt;span class="s"&gt;    "/sbin/",&lt;/span&gt;
&lt;span class="s"&gt;    "/lib/",&lt;/span&gt;
&lt;span class="s"&gt;    HOME + "/.bashrc",&lt;/span&gt;
&lt;span class="s"&gt;    HOME + "/.zshrc",&lt;/span&gt;
&lt;span class="s"&gt;    HOME + "/.profile",&lt;/span&gt;
&lt;span class="s"&gt;    HOME + "/.bash_profile",&lt;/span&gt;
&lt;span class="s"&gt;    HOME + "/.claude/settings.json",&lt;/span&gt;
&lt;span class="s"&gt;    HOME + "/.msmtprc",&lt;/span&gt;
&lt;span class="s"&gt;    HOME + "/.mbsyncrc",&lt;/span&gt;
&lt;span class="s"&gt;    HOME + "/.gnupg/",&lt;/span&gt;
&lt;span class="s"&gt;    HOME + "/.ssh/",&lt;/span&gt;
&lt;span class="s"&gt;    # Secrets / credentials:&lt;/span&gt;
&lt;span class="s"&gt;    HOME + "/.password-store/",&lt;/span&gt;
&lt;span class="s"&gt;    HOME + "/.aws/credentials",&lt;/span&gt;
&lt;span class="s"&gt;    HOME + "/.kube/config",&lt;/span&gt;
&lt;span class="s"&gt;]&lt;/span&gt;

&lt;span class="s"&gt;# Only these paths may be exempted by project-level config.&lt;/span&gt;
&lt;span class="s"&gt;# Home-directory paths can never be exempted.&lt;/span&gt;
&lt;span class="s"&gt;EXEMPTABLE = {"/etc/", "/usr/", "/boot/", "/sbin/", "/lib/"}&lt;/span&gt;

&lt;span class="s"&gt;# Load project-level exemptions from .claude/guard-exempt-paths.json&lt;/span&gt;
&lt;span class="s"&gt;project_dir = os.environ.get("CLAUDE_PROJECT_DIR", "")&lt;/span&gt;
&lt;span class="s"&gt;if project_dir:&lt;/span&gt;
&lt;span class="s"&gt;    exempt_file = os.path.join(project_dir, ".claude", "guard-exempt-paths.json")&lt;/span&gt;
&lt;span class="s"&gt;    try:&lt;/span&gt;
&lt;span class="s"&gt;        with open(exempt_file) as f:&lt;/span&gt;
&lt;span class="s"&gt;            requested = set(json.load(f))&lt;/span&gt;
&lt;span class="s"&gt;        exemptions = requested &amp;amp; EXEMPTABLE&lt;/span&gt;
&lt;span class="s"&gt;        if exemptions:&lt;/span&gt;
&lt;span class="s"&gt;            PROTECTED = [p for p in PROTECTED if p not in exemptions]&lt;/span&gt;
&lt;span class="s"&gt;            syslog.syslog(syslog.LOG_INFO,&lt;/span&gt;
&lt;span class="s"&gt;                f"PATH_GUARD_EXEMPT: {sorted(exemptions)} for {project_dir}")&lt;/span&gt;
&lt;span class="s"&gt;    except (FileNotFoundError, json.JSONDecodeError, TypeError):&lt;/span&gt;
&lt;span class="s"&gt;        pass&lt;/span&gt;

&lt;span class="s"&gt;# Also match ~/. shorthand versions&lt;/span&gt;
&lt;span class="s"&gt;TILDE_PROTECTED = ["~/" + p[len(HOME)+1:] for p in PROTECTED if p.startswith(HOME + "/")]&lt;/span&gt;

&lt;span class="s"&gt;# Parse input&lt;/span&gt;
&lt;span class="s"&gt;raw_input = os.environ.get("HOOK_INPUT", "")&lt;/span&gt;
&lt;span class="s"&gt;try:&lt;/span&gt;
&lt;span class="s"&gt;    data = json.loads(raw_input)&lt;/span&gt;
&lt;span class="s"&gt;    command = data.get("tool_input", {}).get("command", "")&lt;/span&gt;
&lt;span class="s"&gt;except Exception:&lt;/span&gt;
&lt;span class="s"&gt;    deny("Path-guard hook could not parse input — blocking as precaution.")&lt;/span&gt;

&lt;span class="s"&gt;if not command:&lt;/span&gt;
&lt;span class="s"&gt;    sys.exit(0)&lt;/span&gt;

&lt;span class="s"&gt;# Check for protected paths in the command&lt;/span&gt;
&lt;span class="s"&gt;for path in PROTECTED + TILDE_PROTECTED:&lt;/span&gt;
&lt;span class="s"&gt;    if path in command:&lt;/span&gt;
&lt;span class="s"&gt;        deny(f"Command references protected path: {path}")&lt;/span&gt;

&lt;span class="s"&gt;# Allow&lt;/span&gt;
&lt;span class="s"&gt;sys.exit(0)&lt;/span&gt;
&lt;span class="s"&gt;PYTHON_EOF&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Of course, you need to save it somewhere that makes sense for you, then make it executable with &lt;code&gt;chmod +x&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The script works as two layers, the outer being Bash and inner being Python. Bash is needed first because that's what Claude Code's hook system expects. But it's tricky to parse JSON with Bash, whereas that's Python's thing. So, Bash reads the incoming data and hands it to Python for the actual inspection.&lt;/p&gt;
&lt;h3 id="why-simple-text-matching-for-the-hook-script"&gt;Why simple text matching for the hook script?&lt;/h3&gt;
&lt;p&gt;You might wonder why the script uses simple text matching. After all, it just looks for protected paths like &lt;code&gt;/etc/&lt;/code&gt; or &lt;code&gt;~/.ssh/&lt;/code&gt; anywhere in the command text. That's obviously a blunt instrument and you may be tempted to go for a more sophisticated approach, like separating out which parts are actual file paths, and which are arguments or text strings.&lt;/p&gt;
&lt;p&gt;It might work for you but I tried that and it was too fragile because common command formats would trip up the parser. In the end, I settled on the simpler solution because it avoids that false sense of security. Yes, it can be fooled by a malicious agent, but at least there's no pretense of defending against edge-cases. The problems you know about are easier to deal with than the unknowns.&lt;/p&gt;
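&lt;p&gt;The behaviour of the substring check, including its known blind spots, is easy to see in isolation. This Python sketch mirrors the matching loop from the hook with a shortened list; &lt;code&gt;first_protected_path&lt;/code&gt; is a hypothetical helper name and the example commands are made up for illustration.&lt;/p&gt;

```python
# Shortened list for illustration. The hook above protects many more paths.
PROTECTED = ["/etc/", "~/.ssh/"]

def first_protected_path(command, protected=PROTECTED):
    """Return the first protected path found in the command text, else None."""
    for path in protected:
        if path in command:
            return path
    return None

examples = [
    "cat /etc/passwd",                 # direct reference: caught
    'echo "docs mention /etc/ here"',  # string literal: false positive
    "D=/et; cat ${D}c/passwd",         # path built at runtime: missed
    "ls -la",                          # unrelated command: allowed
]
for command in examples:
    print(repr(command), first_protected_path(command))
```

&lt;p&gt;The first two commands are flagged and the last two are not, which is exactly the trade-off listed in the script's own limitations comment.&lt;/p&gt;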
&lt;h3 id="per-project-exemptions"&gt;Per-project exemptions&lt;/h3&gt;
&lt;p&gt;There's one big problem, however. I run agents that manage remote servers which regularly need to run commands that contain a protected path. For example:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;ssh myserver "docker exec caddy reload --config /etc/caddy/Caddyfile"&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;/etc/caddy/Caddyfile&lt;/code&gt; path is inside a Docker container on a remote host, not on my local machine, but the substring match catches it anyway because it sees &lt;code&gt;/etc/&lt;/code&gt; in the command text.&lt;/p&gt;
&lt;p&gt;The fix is per-project exemptions. A project that needs to reference system paths in remote commands can create a &lt;code&gt;.claude/guard-exempt-paths.json&lt;/code&gt; file in its project root:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"/etc/"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Only exempt what you actually need. The full set of exemptable paths is &lt;code&gt;/etc/&lt;/code&gt;, &lt;code&gt;/usr/&lt;/code&gt;, &lt;code&gt;/boot/&lt;/code&gt;, &lt;code&gt;/sbin/&lt;/code&gt;, and &lt;code&gt;/lib/&lt;/code&gt;, but most projects only need one or two.&lt;/p&gt;
&lt;p&gt;The hook knows which project it's running in because Claude Code passes that information through an environment variable. If it finds a &lt;code&gt;guard-exempt-paths.json&lt;/code&gt; file in the project, it skips those paths when checking commands for that project.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;I'd rather have a short script that's honest about its gaps than a long one that pretends it doesn't have any.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;The obvious risk is that a compromised project config could exempt everything and effectively disable the hook. So I built a hard limit into the script: only the five system-directory paths listed above can be exempted (you can see the &lt;code&gt;EXEMPTABLE&lt;/code&gt; set in the code). Even if someone adds &lt;code&gt;"~/.ssh/"&lt;/code&gt; to the exemption file, the script ignores it, and the hook always checks for home directory paths regardless of any exemption file.&lt;/p&gt;
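&lt;p&gt;As a worked example of that hard limit, the snippet below applies the same set intersection the hook uses to a deliberately greedy exemption request. &lt;code&gt;effective_exemptions&lt;/code&gt; is a hypothetical helper name, and the requested list is invented for illustration.&lt;/p&gt;

```python
# Same values as the EXEMPTABLE set in the hook script.
EXEMPTABLE = {"/etc/", "/usr/", "/boot/", "/sbin/", "/lib/"}

def effective_exemptions(requested):
    """Keep only the requested paths that the hook allows to be exempted."""
    return set(requested).intersection(EXEMPTABLE)

# A project config that asks for more than it is allowed to have.
requested = ["/etc/", "~/.ssh/", "/home/youruser/.ssh/"]
print(sorted(effective_exemptions(requested)))  # prints ['/etc/']
```

&lt;p&gt;Both SSH entries are dropped without complaint, so the only effect of the request is that &lt;code&gt;/etc/&lt;/code&gt; stops being checked for that project.&lt;/p&gt;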
&lt;p&gt;This is also where the layers work together. Layer 1 (deny permissions) still blocks Write and Edit to system paths unconditionally, regardless of any project-level configuration. The exemption only affects the Bash hook's substring check so even in a project with all five system paths exempted, an agent still can't write to &lt;code&gt;/etc/&lt;/code&gt; because the deny rules won't allow it.&lt;/p&gt;
&lt;p&gt;It's important to be aware that this isn't a silver bullet because an agent could edit its own rules,&lt;sup&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#known-limitations"&gt;*&lt;/a&gt;&lt;/sup&gt; so you need to adjust permissions to match your situation. This honestly needs ongoing work as it isn't a solved problem and the tooling is still evolving. You do your best with what's available and stay conscious that we don't yet have adequate solutions. Defence in depth is the key here.&lt;/p&gt;
&lt;h3 id="registering-the-hook"&gt;Registering the hook&lt;/h3&gt;
&lt;p&gt;The hook goes in your &lt;code&gt;~/.claude/settings.json&lt;/code&gt;:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;"PreToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bash"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="nt"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="nt"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/path/to/guard-sensitive-paths.sh"&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You're not limited to a single hook, either. Say you have other commands you want to guard against, like outbound email or database access. You can still add separate hooks for each area. Multiple hooks on the same matcher run in parallel, and all of them must pass for the command to execute. The command is blocked if any one returns a deny decision.&lt;/p&gt;
&lt;h2 id="testing-your-security-setup"&gt;Testing your security setup&lt;/h2&gt;
&lt;p&gt;Once you've saved the settings and the hook script, test it:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Try a command that references a protected path, like &lt;code&gt;cat /etc/passwd&lt;/code&gt;. It should be blocked with a message telling you which path triggered it.&lt;/li&gt;
&lt;li&gt;Check your system log for a &lt;code&gt;PATH_GUARD_BLOCK&lt;/code&gt; entry.&lt;/li&gt;
&lt;li&gt;Try a normal command like &lt;code&gt;ls -la&lt;/code&gt;. It should work fine.&lt;/li&gt;
&lt;li&gt;If you've set up a per-project exemption, try the same &lt;code&gt;/etc/&lt;/code&gt; command from that project. It should be allowed.&lt;/li&gt;
&lt;/ol&gt;
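&lt;p&gt;You can also exercise the hook directly, without going through an agent, by piping it the same JSON shape it reads on stdin. This Python sketch is one way to do that. It assumes the hook only needs the &lt;code&gt;tool_input.command&lt;/code&gt; field, as in the script above; &lt;code&gt;HOOK_PATH&lt;/code&gt; is the placeholder path from the registration example, and the helper names are hypothetical.&lt;/p&gt;

```python
import json
import subprocess

HOOK_PATH = "/path/to/guard-sensitive-paths.sh"  # placeholder: point at your script

def build_payload(command):
    """Build minimal hook input containing only the field the guard reads."""
    return json.dumps({"tool_input": {"command": command}})

def parse_decision(stdout):
    """Return 'deny' if the hook printed a deny decision, else 'allow'."""
    if not stdout.strip():
        return "allow"  # the guard prints nothing when it has no objection
    data = json.loads(stdout)
    return data.get("hookSpecificOutput", {}).get("permissionDecision", "allow")

def check(command, hook_path=HOOK_PATH):
    """Pipe one command through the hook and report its decision."""
    result = subprocess.run(
        [hook_path],
        input=build_payload(command),
        capture_output=True,
        text=True,
        timeout=10,
    )
    return parse_decision(result.stdout)
```

&lt;p&gt;With the real script path filled in, &lt;code&gt;check("cat /etc/passwd")&lt;/code&gt; should report a deny and &lt;code&gt;check("ls -la")&lt;/code&gt; an allow. Per-project exemptions depend on the &lt;code&gt;CLAUDE_PROJECT_DIR&lt;/code&gt; environment variable, so set it before running the check if you want to test those.&lt;/p&gt;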
&lt;p&gt;You have two options if there's a false positive: either add the specific path to a per-project exemption if it's a system path, or adjust the PROTECTED list in the hook if the path doesn't need protecting.&lt;/p&gt;
&lt;h2 id="whats-next-for-locking-down-claude-code"&gt;What's next for locking down Claude Code&lt;/h2&gt;
&lt;p&gt;The sandbox is more secure but it wasn't workable for a lot of our projects. This setup is the trade-off I settled on: agents get the access they need for SSH, remote servers, and file management. Plus, the deny rules and hooks catch the things I don't want them touching. It's been running for a few months and it's stable.&lt;/p&gt;
&lt;p&gt;I'm still not fully comfortable with where things are because the &lt;a href="https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/"&gt;OpenClaw disclosures&lt;/a&gt; showed what happens when agent frameworks don't think about credential isolation. Shortly after publishing the first version of this article, a supply chain attack on LiteLLM&lt;sup&gt;&lt;a href="https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/#fn4"&gt;4&lt;/a&gt;&lt;/sup&gt; deployed a credential harvester that swept SSH keys, cloud secrets, and API tokens from every environment it touched. It was live on PyPI for three hours.&lt;/p&gt;
&lt;p&gt;These are real incidents affecting production systems because people jumped on the bandwagon too early without thinking of the implications.&lt;/p&gt;
&lt;p&gt;I've been running Another Cup of Coffee for over twenty years and if there's one thing I've learned, it's that you don't adopt experimental technology without understanding where it risks your business. Our clients trust us with their infrastructure and their data. While we have the luxury of being able to move quickly, we're also careful not to break things for the people who trust us with their livelihoods. So we build what protections we can, we're honest about the gaps, and we keep watching. I'll write more as the tooling evolves.&lt;/p&gt;
&lt;p&gt;If you're not sure where you stand with any of this, I'm happy to &lt;a href="https://anothercoffee.net/contact/"&gt;have a conversation&lt;/a&gt; about it.&lt;/p&gt;
&lt;section class="mt-4 pt-4"&gt;
&lt;h3 class="text-center pb-4"&gt;Common Questions&lt;/h3&gt;
&lt;div class="container border bg-light p-4"&gt;
&lt;p&gt;&lt;strong&gt;Can I use this alongside the sandbox?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Yes. The deny permissions and hooks work whether the sandbox is on or off. If your workflow doesn't need SSH or file deletion, you could run the sandbox for its kernel-level isolation and still add deny permissions and hooks as additional layers. They don't conflict.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Does this work on macOS?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Yes. The deny permissions and hook registration are identical on macOS. The hook script needs Python 3 and bash. macOS ships bash but not Python 3; you'll need to install Python via Xcode Command Line Tools (&lt;code&gt;xcode-select --install&lt;/code&gt;) or Homebrew. The only other difference is how you read the audit log: use &lt;code&gt;log show&lt;/code&gt; instead of &lt;code&gt;journalctl&lt;/code&gt;. If you're using the sandbox path, macOS uses Seatbelt instead of bubblewrap and doesn't need socat.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What happens if I get a false positive?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The hook blocks the command and tells you which path triggered it. You can either adjust the PROTECTED list in the hook, or if it's a sysadmin project that legitimately references system paths on remote servers, create a per-project exemption file. The deny permissions (Layer 1) never produce false positives because they only apply to Write and Edit, not Read or Bash.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What about Windows / WSL?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;WSL2 runs a real Linux kernel, so everything in this guide works as-is. WSL1 does not support user namespaces, so the bubblewrap sandbox won't work there, but the deny permissions and hooks will. If you're on WSL1, the hook-based approach described here is your best option.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Do I need a dedicated user account?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;No, but it's the strongest single thing you can do. Running Claude Code under a separate account means the agent can't access your personal files, SSH keys, or shell configuration because Unix permissions won't let it cross accounts. The rest of this guide works either way. If you run Claude Code as your own user, the deny rules and hooks are doing more of the heavy lifting.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Should I read the companion article first?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;It helps, but you don't need to. &lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/"&gt;That article&lt;/a&gt; covers the threat landscape and security layers conceptually. This one is the implementation. If you want to understand &lt;em&gt;why&lt;/em&gt; these protections matter, read that first. If you just want to set them up, you're in the right place.&lt;/p&gt;
&lt;/div&gt;
&lt;/section&gt;

&lt;section id="known-limitations" class="mt-4 pt-4"&gt;
&lt;h3 class="text-center pb-4"&gt;Known Limitations&lt;/h3&gt;
&lt;div class="container border bg-light p-4"&gt;
&lt;p&gt;This setup is experimental and actively evolving. The following gaps are ones I'm aware of and haven't yet resolved:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The hook script uses &lt;code&gt;set -euo pipefail&lt;/code&gt;&lt;/strong&gt;, which means it exits immediately on unexpected errors. I haven't yet confirmed whether Claude Code treats a crashed hook (non-zero exit, no output) as allow or deny. If it's allow, this is a fail-open vulnerability. I'm testing this and will update the script when I have a definitive answer.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No &lt;code&gt;Read&lt;/code&gt; deny rules.&lt;/strong&gt; The deny block only covers Write and Edit. Claude Code's own Read tool can still access sensitive files like &lt;code&gt;~/.ssh/&lt;/code&gt; unless you've set up a dedicated user account. The hook catches Bash reads (like &lt;code&gt;cat ~/.ssh/...&lt;/code&gt;) but not Read tool calls. For SSH keys specifically, ssh-agent is the primary protection.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The PROTECTED list is not exhaustive.&lt;/strong&gt; Files like &lt;code&gt;~/.bash_history&lt;/code&gt;, &lt;code&gt;~/.zsh_history&lt;/code&gt;, browser profiles, and other sensitive data in your home directory are not covered by the hook. Add paths relevant to your setup.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Auto-approved commands have limits.&lt;/strong&gt; &lt;code&gt;Bash(cat:*)&lt;/code&gt; and &lt;code&gt;Bash(cp:*)&lt;/code&gt; are caught by the hook when they reference protected paths, but only paths in the PROTECTED list. A &lt;code&gt;cat&lt;/code&gt; on a sensitive file not in that list will pass both the allow rule and the hook.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Agents can create their own exemption files.&lt;/strong&gt; Since agents can write to project directories, they can create or modify &lt;code&gt;.claude/guard-exempt-paths.json&lt;/code&gt; to exempt system paths. The hard limit in the script (only system directories, never home paths) contains this, but it's worth being aware of.&lt;/li&gt;
&lt;/ul&gt;
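&lt;p&gt;The first gap above can be narrowed by making the check fail closed. The following is a minimal Python sketch, not the hook from my setup. It assumes the hook is registered for the Bash tool only, that the event arrives on stdin as JSON with the command at &lt;code&gt;tool_input.command&lt;/code&gt;, and that exit code 2 is the blocking exit code; verify all three against the hooks documentation for your version. The names &lt;code&gt;decide&lt;/code&gt; and &lt;code&gt;references_protected&lt;/code&gt; and the two-entry &lt;code&gt;PROTECTED&lt;/code&gt; list are hypothetical.&lt;/p&gt;

```python
import json
import os
import sys

# Hypothetical two-entry list for illustration; the real PROTECTED list is longer.
PROTECTED = ["~/.ssh", "~/.gnupg"]

ALLOW = 0
BLOCK = 2  # assumed blocking exit code; confirm against the hooks docs for your version


def references_protected(command, protected):
    """Return the first protected path the command mentions, or None."""
    for path in protected:
        expanded = os.path.expanduser(path)
        if path in command or expanded in command:
            return path
    return None


def decide(raw_event, protected=PROTECTED):
    """Map a hook event (JSON text) to an exit code, denying on any error."""
    try:
        event = json.loads(raw_event)
        command = event["tool_input"]["command"]
        if not isinstance(command, str):
            raise TypeError("command is not a string")
        hit = references_protected(command, protected)
    except Exception as exc:
        # Fail closed: a broken check must never turn into an allow.
        print(f"guard error, denying: {exc!r}", file=sys.stderr)
        return BLOCK
    if hit is not None:
        print(f"blocked: command references {hit}", file=sys.stderr)
        return BLOCK
    return ALLOW
```

&lt;p&gt;To use this as a hook, end the script with &lt;code&gt;sys.exit(decide(sys.stdin.read()))&lt;/code&gt;. The design choice is that every unexpected condition, including malformed input, takes the deny path explicitly, so the outcome doesn't depend on how the caller treats a crash.&lt;/p&gt;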
&lt;p&gt;The code in this article is shared from my own setup. Use it as a starting point, not a finished solution. If you find other gaps or have improvements, I'd like to &lt;a href="https://anothercoffee.net/contact/"&gt;hear about them&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;/section&gt;

&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;This article is part of &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;an ongoing series&lt;/a&gt; on how Another Cup of Coffee is adapting to AI. &lt;a href="https://anothercoffee.net/categories/ai/"&gt;Explore all articles in this series&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;
&lt;div class="mt-5"&gt;
    &lt;h3&gt;You may also like&lt;/h3&gt;

    &lt;div class="row"&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/trust-but-verify-card-300x150.jpg" class="card-img-top" alt="AI coding tool security and sandboxing"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/" class="listtitle"&gt;Trust, But Verify: What's Really Between Your AI Coding Tool and Your SSH Keys&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2026-03-12T12:00:00Z" title="12 March 2026"&gt;12 March 2026&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                        &lt;p class="card-text flex-grow-1"&gt;AI coding tools run with your full user permissions. I looked at what's actually protecting developers, what isn't, and what you should do about it.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/openclaw-security-card-300x150.jpg" class="card-img-top" alt="Red lobster on a white plate"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/" class="listtitle"&gt;What OpenClaw Teaches Us About AI Agent Security&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2026-02-22T12:00:00Z" title="22 February 2026"&gt;22 February 2026&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                        &lt;p class="card-text flex-grow-1"&gt;OpenClaw's security crisis exposed real problems with how AI agents handle credentials, plugins, and system access. Here's what went wrong and how a convention-based approach avoids these risks entirely.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/run-dozens-of-projects-with-ai/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/run-dozens-of-projects-ai-card-300x150.jpg" class="card-img-top" alt="One person running dozens of projects with AI agents"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/run-dozens-of-projects-with-ai/" class="listtitle"&gt;I Run Dozens of Projects with AI. The Hard Part Isn't the AI.&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2025-12-20T12:00:00Z" title="20 December 2025"&gt;20 December 2025&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                        &lt;p class="card-text flex-grow-1"&gt;One person, dozens of projects, four AI vendors. I spent a year building a coordination system for AI agents. The components are simple. Getting them right was not.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;

&lt;div class="mt-4 pt-4 text-muted small border-top border-bottom"&gt;
    &lt;h3 class="text-muted small"&gt;Footnotes&lt;/h3&gt;
    &lt;ol&gt;
      &lt;li id="fn1"&gt;Claude Code documentation, &lt;a href="https://docs.anthropic.com/en/docs/claude-code" target="_blank" rel="noopener noreferrer"&gt;Settings and permissions&lt;/a&gt;.&lt;/li&gt;
      &lt;li id="fn2"&gt;&lt;a href="https://github.com/containers/bubblewrap" target="_blank" rel="noopener noreferrer"&gt;bubblewrap&lt;/a&gt; on GitHub. Unprivileged sandboxing tool using Linux kernel namespaces.&lt;/li&gt;
      &lt;li id="fn3"&gt;GitHub issues &lt;a href="https://github.com/anthropics/claude-code/issues/6631" target="_blank" rel="noopener noreferrer"&gt;#6631&lt;/a&gt;, &lt;a href="https://github.com/anthropics/claude-code/issues/6699" target="_blank" rel="noopener noreferrer"&gt;#6699&lt;/a&gt;, and &lt;a href="https://github.com/anthropics/claude-code/issues/27040" target="_blank" rel="noopener noreferrer"&gt;#27040&lt;/a&gt; on the Claude Code repository document cases where deny rules were silently bypassed across multiple versions.&lt;/li&gt;
      &lt;li id="fn4"&gt;LiteLLM, &lt;a href="https://docs.litellm.ai/blog/security-update-march-2026" target="_blank" rel="noopener noreferrer"&gt;Security Update: Suspected Supply Chain Incident&lt;/a&gt;, March 2026. Malicious versions 1.82.7 and 1.82.8 were live on PyPI for approximately three hours before being quarantined.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Featured image: Photo by &lt;a href="https://unsplash.com/@apsprudente?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText" target="_blank" rel="noopener noreferrer"&gt;Patricia Prudente&lt;/a&gt; on &lt;a href="https://unsplash.com/photos/girl-sitting-on-hammock-between-plants-jNdQoB0ziTE?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText" target="_blank" rel="noopener noreferrer"&gt;Unsplash&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;

&lt;script type="application/ld+json"&gt;
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Can I use this alongside the sandbox?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. The deny permissions and hooks work whether the sandbox is on or off. If your workflow doesn't need SSH or file deletion, you could run the sandbox for its kernel-level isolation and still add deny permissions and hooks as additional layers. They don't conflict."
      }
    },
    {
      "@type": "Question",
      "name": "Does this work on macOS?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. The deny permissions and hook registration are identical on macOS. The hook script needs Python 3 and bash. macOS ships bash but not Python 3; you'll need to install Python via Xcode Command Line Tools (xcode-select --install) or Homebrew. The only other difference is how you read the audit log: use log show instead of journalctl. If you're using the sandbox path, macOS uses Seatbelt instead of bubblewrap and doesn't need socat."
      }
    },
    {
      "@type": "Question",
      "name": "What happens if I get a false positive?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "The hook blocks the command and tells you which path triggered it. You can either adjust the PROTECTED list in the hook, or if it's a sysadmin project that legitimately references system paths on remote servers, create a per-project exemption file. The deny permissions (Layer 1) never produce false positives because they only apply to Write and Edit, not Read or Bash."
      }
    },
    {
      "@type": "Question",
      "name": "What about Windows / WSL?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "WSL2 runs a real Linux kernel, so everything in this guide works as-is. WSL1 does not support user namespaces, so the bubblewrap sandbox won't work there, but the deny permissions and hooks will. If you're on WSL1, the hook-based approach described here is your best option."
      }
    },
    {
      "@type": "Question",
      "name": "Do I need a dedicated user account?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "No, but it's the strongest single thing you can do. Running Claude Code under a separate account means the agent can't access your personal files, SSH keys, or shell configuration because Unix permissions won't let it cross accounts. The rest of this guide works either way. If you run Claude Code as your own user, the deny rules and hooks are doing more of the heavy lifting."
      }
    },
    {
      "@type": "Question",
      "name": "Should I read the companion article first?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "It helps, but you don't need to. That article covers the threat landscape and security layers conceptually. This one is the implementation. If you want to understand why these protections matter, read that first. If you just want to set them up, you're in the right place."
      }
    }
  ]
}
&lt;/script&gt;</description><category>AI</category><category>AI Security</category><category>Claude Code</category><category>Developer Tools</category><category>Hooks</category><category>Permissions</category><category>Prompt Injection</category><category>Security</category><guid>https://anothercoffee.net/the-practical-guide-to-locking-down-claude-code/</guid><pubDate>Mon, 23 Mar 2026 12:00:00 GMT</pubDate></item><item><title>VibeOps: Let AI Do the Prep, Not the Decisions</title><link>https://anothercoffee.net/you-dont-need-vibe-devops/</link><dc:creator>Anthony Lopez-Vito</dc:creator><description>&lt;figure&gt;&lt;img src="https://anothercoffee.net/images/posts/vibe-devops-og-1200x630.jpg"&gt;&lt;/figure&gt; &lt;p class="intro"&gt;There's a growing conversation about whether we need "VibeOps", an AI tool that reads your repo and automatically sets up CI/CD, containerisation, scaling, and infrastructure. In my experience the idea addresses a real gap. AI tools can generate frontend and backend code rapidly, but getting code to production safely still requires judgment.&lt;/p&gt;

&lt;p&gt;I do get the appeal. But automating deployment decisions is a different problem to automating code generation, and the consequences of getting it wrong are much worse.&lt;/p&gt;
&lt;p&gt;At Another Cup of Coffee, I run a setup where AI agents handle most of the software development workflow: writing code, coordinating across projects, drafting documentation, managing communications. We use CI/CD where it fits the project, but the deployment &lt;em&gt;decisions&lt;/em&gt; stay human-gated, quite deliberately. So why not automate the lot, and how does our setup actually work in practice?&lt;/p&gt;
&lt;h3 id="the-setup-ai-agents-that-dont-deploy-themselves"&gt;The Setup: AI Agents That Don't Deploy Themselves&lt;/h3&gt;
&lt;p&gt;Our development environment runs on a &lt;a href="https://anothercoffee.net/building-an-operating-environment-for-ai-agents/"&gt;multi-agent architecture&lt;/a&gt;. Currently that's Claude Code, Codex, and Gemini CLI across different projects, with a number of specialised agents for each one. A developer agent writes code, a sysadmin agent manages infrastructure, a writer agent handles content, a project manager coordinates timelines. You get the idea. I appreciate that sounds like chatbots bolted onto an IDE, but it really isn't. They're session-based agents with persistent state, each with access to the filesystem, shell commands, and, when required, other agents and projects. The architecture is vendor-neutral; each project has its own AI provider through vendor-specific instruction files, and the coordination conventions work identically regardless of which tool is running.&lt;/p&gt;
&lt;p&gt;Agents communicate through a memo system. When the developer agent finishes a build, it doesn't trigger a deployment pipeline. Instead, it writes a memo to the sysadmin agent's inbox describing what was built, what changed, and what infrastructure it needs. The sysadmin agent reads the memo, reviews the requirements, prepares the deployment configuration, and presents the steps to a human operator for execution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Agents prepare deployments. Humans decide when and how to execute them.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A typical deployment flow looks like this:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The developer agent builds the application and produces artifacts. It might, for example, be a static site, a Docker image, a database migration, or all three.&lt;/li&gt;
&lt;li&gt;It writes a deployment memo to the sysadmin project's inbox with full context on what's being deployed, what services it depends on, what environment variables it needs, and what verification steps should follow.&lt;/li&gt;
&lt;li&gt;The sysadmin agent reads the memo, creates or updates the Docker Compose configuration and reverse proxy rules, and writes a step-by-step runbook.&lt;/li&gt;
&lt;li&gt;The human operator reviews the runbook. Depending on the operation, they either execute it manually via SSH or approve the agent to run it directly.&lt;/li&gt;
&lt;li&gt;The sysadmin agent updates its state file and marks the memo as complete.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;We don't draw a hard line between what agents handle and what humans do. Some operations the agent runs directly, like copying build artifacts to a staging server. Others get a human at the keyboard, particularly anything involving firewall rules on a production box. It depends on how much damage a mistake could do, and that changes from one operation to the next.&lt;/p&gt;
&lt;h3 id="why-not-automate-deployment"&gt;Why Not Automate Deployment?&lt;/h3&gt;
&lt;p&gt;For most people, the vision behind VibeOps is an AI that reads your code and decides how it should run. Reading the code is the easy part, but how something &lt;em&gt;should&lt;/em&gt; run depends on context that lives nowhere near the code.&lt;/p&gt;
&lt;p&gt;A startup with a single VPS has different deployment requirements to an enterprise with a large AWS budget. The code could be identical, but the infrastructure decisions are completely different. An AI that auto-provisions "optimal" infrastructure has no concept of your monthly spend limit unless you tell it, and if you're telling it all your constraints, you're just writing a different kind of configuration file. You haven't automated the decision-making, you've moved it from a hosting control panel to an AI prompt.&lt;/p&gt;
&lt;p&gt;It's the same thing with traffic patterns. A project serving a handful of internal users and a project facing the public internet can share the same general structure but need radically different scaling and security configurations. The AI would need to know a whole bunch of things beyond just your current traffic, like your expected traffic, your tolerance for downtime, and your plan for traffic spikes. These are business decisions rather than technical ones.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;On a fixed-capacity machine, uncontrolled automation crashes you. On cloud infrastructure, it bankrupts you.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;There's also the cost runaway problem. On cloud infrastructure, uncontrolled automation risks running up your bill. Combine a modest traffic spike with an auto-scaling rule that's a bit too eager, and suddenly you owe your cloud provider a fortune. On a fixed-capacity machine, the failure is different but the root cause is the same. Instead of a surprise bill, you get a frozen system.&lt;/p&gt;
&lt;p&gt;I hit the same problem at smaller scale a few weeks ago. One of my development environments runs on a modest workstation, an i3-6100 with 8GB of RAM, nothing fancy. I'd been a little careless about Docker container management as I was deep into a project. Four separate stacks were running simultaneously, fourteen containers in total, all set to &lt;code&gt;restart: always&lt;/code&gt; so they'd come back up after every reboot whether I needed them or not. One afternoon the machine just froze. It was completely unresponsive for about twenty minutes, right in the middle of some urgent work. I couldn't even get a terminal, so I had to hard reboot.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The dangerous failure mode of automated infrastructure is that it works too well.&lt;/strong&gt; It scales resources you didn't intend to scale, restarts services you meant to stop, provisions capacity you can't afford.&lt;/p&gt;
&lt;h3 id="keeping-the-infrastructure-simple"&gt;Keeping the Infrastructure Simple&lt;/h3&gt;
&lt;p&gt;The stacks we build for clients tend to follow the same principle: every piece should be something you can fully understand and troubleshoot without too much effort. Containerised applications, a reverse proxy that handles HTTPS automatically, hardened servers, scheduled backups, encrypted secrets. Kubernetes and Terraform solve real problems at scale, but for many of our clients who run a few services, they're way too much overhead for what you get back.&lt;/p&gt;
&lt;p&gt;One thing I always care about is rollback. Deployment configurations are version-controlled in git, so if something goes wrong you redeploy the previous version. For data, you restore from your most recent scheduled backup. The simpler the stack, the easier this is. All you do is put the old files back and restart.&lt;/p&gt;
&lt;h3 id="how-ai-agents-coordinate-deployments"&gt;How AI Agents Coordinate Deployments&lt;/h3&gt;
&lt;p&gt;The memo system deserves more detail because it's the part that most resembles the VibeOps vision of AI understanding your project and making infrastructure decisions, except that it keeps humans in control.&lt;/p&gt;
&lt;p&gt;Each agent project has an inbox (&lt;code&gt;memos/incoming/&lt;/code&gt;) and an archive (&lt;code&gt;memos/archived/&lt;/code&gt;). Every memo follows the same structure. There's a header with sender, date, and priority, then context, then action items with checkboxes, then a completion section. I know this sounds like bureaucracy, and honestly it sort of is. But any agent can read any other agent's correspondence, and when an agent finishes processing a memo it fills in the completion notes and moves it to the archive. The result is that every infrastructure decision has a paper trail; when something breaks, you can trace back through archived memos to see exactly what was deployed, when, why, and by which agent.&lt;/p&gt;
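&lt;p&gt;As a rough illustration of that lifecycle, here is a minimal Python sketch. Our real memos are markdown files written by the agents themselves, so treat everything here as an assumption made for the example: the &lt;code&gt;Memo&lt;/code&gt; class, the &lt;code&gt;deliver&lt;/code&gt; and &lt;code&gt;archive&lt;/code&gt; helpers, the field names, and the exact file layout. Only the &lt;code&gt;memos/incoming/&lt;/code&gt; and &lt;code&gt;memos/archived/&lt;/code&gt; directories come from the description above.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from datetime import date
from pathlib import Path


@dataclass
class Memo:
    """Hypothetical memo model: header, context, and checkbox action items."""
    sender: str
    sent: date
    priority: str
    context: str
    action_items: list = field(default_factory=list)

    def render(self):
        """Render the memo as text, ending with an empty completion section."""
        lines = [
            f"From: {self.sender}",
            f"Date: {self.sent.isoformat()}",
            f"Priority: {self.priority}",
            "",
            "## Context",
            self.context,
            "",
            "## Action items",
        ]
        lines += [f"- [ ] {item}" for item in self.action_items]
        lines += ["", "## Completion", ""]
        return "\n".join(lines)


def deliver(memo, project_root, name):
    """Write the memo into the receiving project's inbox and return its path."""
    inbox = Path(project_root) / "memos" / "incoming"
    inbox.mkdir(parents=True, exist_ok=True)
    path = inbox / name
    path.write_text(memo.render(), encoding="utf-8")
    return path


def archive(memo_path, completion_notes):
    """Append completion notes, then move the memo from the inbox to the archive."""
    memo_path = Path(memo_path)
    archived_dir = memo_path.parent.parent / "archived"
    archived_dir.mkdir(parents=True, exist_ok=True)
    text = memo_path.read_text(encoding="utf-8").rstrip("\n")
    target = archived_dir / memo_path.name
    target.write_text(text + "\n" + completion_notes + "\n", encoding="utf-8")
    memo_path.unlink()
    return target
```

&lt;p&gt;Because the archive step only appends notes and moves the file, the original request and the record of what was done stay together in one document, which is what makes the paper trail easy to follow later.&lt;/p&gt;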
&lt;p&gt;Agents are constrained to their scope through layered controls. Most can only interact with other projects through the memo system, and any operation that could affect a live server gets presented to a human operator first.&lt;/p&gt;
&lt;p&gt;Cross-project coordination happens asynchronously. If a security concern is discovered, the sysadmin agent can write memos to all affected projects describing the new access control policy. Each project's agent processes the memo independently. No central orchestrator, no shared state, no single point of failure.&lt;/p&gt;
&lt;p&gt;The agents also maintain state files (&lt;code&gt;STATE.md&lt;/code&gt;) that track current status, recent progress, next actions, and handover notes. When a new conversation starts, the agent reads its state file and pending memos to understand where things left off. Agents don't need persistent memory of every past conversation. They reconstruct context from documentation, the same way a human engineer would read a project's README and recent commit history before starting work. I've written about how this scales across dozens of projects in &lt;a href="https://anothercoffee.net/run-dozens-of-projects-with-ai/"&gt;Run Dozens of Projects with AI&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="why-access-between-projects-is-deliberate"&gt;Why Access Between Projects Is Deliberate&lt;/h4&gt;
&lt;p&gt;Unlike the hub-and-spoke model most agent frameworks follow, our architecture uses a mesh. Each project is an autonomous node with its own state, its own instructions, and its own agent identity. Further, I can set things up so that agents can't see other projects by deploying to separate machines or instances. When an agent does need broader access (a coordinator that works across multiple projects, for example), that access is granted explicitly.&lt;/p&gt;
&lt;p&gt;This matters because a single AI tool managing all your infrastructure at once is a hub model. One compromise, one misconfiguration, or simply one bad judgment call affects everything. A mesh where access is deliberately granted rather than assumed by default limits the damage when something goes wrong.&lt;/p&gt;
&lt;p&gt;I've written in more detail about this architecture in &lt;a href="https://anothercoffee.net/building-an-operating-environment-for-ai-agents/"&gt;Building an Operating Environment for AI Agents&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id="security-the-part-vibeops-would-get-wrong"&gt;Security: The Part VibeOps Would Get Wrong&lt;/h3&gt;
&lt;p&gt;Automated security scanning, patching, and monitoring are well-established and genuinely useful. The problem here is different: an AI deciding how to containerise your application, configure your deployment pipeline, and scale your infrastructure, all based on reading your repo. Those decisions depend on context the tool either can't see or that would be too time-consuming to keep feeding in.&lt;/p&gt;
&lt;p&gt;That's why we keep humans in the loop for those decisions, and layer security controls on top so the agents can't overstep even when they're handling the routine parts. If one layer misses something, the next one hopefully catches it. You can read more about how we approach this in &lt;a href="https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/"&gt;What OpenClaw Teaches Us About AI Agent Security&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The agents are session-based. This means that when I'm not actively working with them, there's no daemon or background service running. At this stage of AI development, I don't think agents are reliable enough to be making infrastructure decisions unsupervised. That will likely change in the future, but right now I'd rather not risk my business or my clients' by having them work while I'm not looking.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;Right now I'd rather not risk my business or my clients' by having AI agents work while I'm not looking.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;h3 id="what-vibeops-should-actually-be"&gt;What VibeOps Should Actually Be&lt;/h3&gt;
&lt;p&gt;There is a real gap between AI-generated code and production deployment, and I think it will get filled well eventually. Right now, for most businesses our size, the practical approach is to let AI handle the preparation but keep a human on the decisions that actually matter.&lt;/p&gt;
&lt;p&gt;In practice that means AI can generate your deployment configurations, write your runbooks, and tell you what it doesn't know. What it shouldn't be doing yet is executing changes to live infrastructure without someone looking at them first. That boundary will shift as the tools get more reliable, but for now the risk of getting it wrong is too high.&lt;/p&gt;
&lt;p&gt;Our agent system works this way. The sysadmin agent prepares configurations, writes runbooks, and flags risks, but it presents everything to a human before anything touches a production server. I'm still figuring out exactly where the boundary sits (it moves as I get more confident in the guardrails), but the principle holds: humans spend their time on decisions, not on remembering technical steps.&lt;/p&gt;
&lt;h3 id="getting-from-vibe-coding-to-production"&gt;Getting from Vibe Coding to Production&lt;/h3&gt;
&lt;p&gt;If you're currently stuck between AI-generated code and manual deployments with no clear path between them, the good news is that most of the pieces already exist. You probably just haven't connected them yet.&lt;/p&gt;
&lt;p&gt;The starting point is to containerise your applications so they run the same way everywhere, put a reverse proxy in front that handles HTTPS automatically, and harden your server before you deploy anything. If you have a developer or technical partner handling this, get them to write deployment runbooks rather than relying on scripts that fail silently. A runbook that says "run this, check this, then run this" is easier to review and safer to hand off than a bash script that does everything at once.&lt;/p&gt;
&lt;p&gt;If you're already using AI coding tools, use them for deployment preparation too. Have them generate your configurations and templates. Make sure you review the output. Let them handle the low-risk steps, and keep your hands on the controls for the rest.&lt;/p&gt;
&lt;hr&gt;
&lt;!-- Common Questions --&gt;
&lt;section class="mt-4 pt-4"&gt;
&lt;h3 class="text-center pb-4"&gt;Common Questions&lt;/h3&gt;
&lt;div class="container border bg-light p-4"&gt;
&lt;p&gt;&lt;strong&gt;What is VibeOps (vibe DevOps)?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;VibeOps is an emerging concept for AI tools that automatically read your codebase and set up deployment infrastructure, CI/CD pipelines, containerisation, and scaling. Sometimes called "vibe DevOps," the idea is to bridge the gap between AI-generated code and production deployment without manual configuration.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Can AI agents safely handle production deployments?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;AI agents can safely prepare deployments by generating configurations, writing runbooks, and flagging risks. What they shouldn't be doing yet is executing changes to live infrastructure without a human reviewing them first. As these tools mature this will likely change, but right now a human-in-the-loop approach is the safest way to get the benefits without the risk.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What's the risk of fully automated cloud deployment?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The main risk is that automation can work too well. On cloud infrastructure, a modest traffic spike combined with an eager auto-scaling rule can run up a significant bill. On a fixed-capacity machine, uncontrolled automation can freeze your system entirely. In both cases the problem is the same: automation doing more than anyone intended, with no human checkpoint to catch it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What's the best approach to deployment for small businesses?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Keep the infrastructure simple enough that you can fully understand and troubleshoot it. Containerise your applications, automate HTTPS, harden your servers, and write deployment runbooks that a human can review. If you're using AI coding tools, use them for deployment preparation too, but keep your hands on the controls for anything that touches a live server.&lt;/p&gt;
&lt;/div&gt;
&lt;/section&gt;

&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;This article is part of &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;an ongoing series&lt;/a&gt; on how Another Cup of Coffee is adapting to AI. &lt;a href="https://anothercoffee.net/categories/ai/"&gt;Explore all articles in this series&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="mt-4 pt-4 text-muted small border-top border-bottom"&gt;
    &lt;p&gt;Featured image photo by &lt;a href="https://unsplash.com/@hanswestbeek?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText" target="_blank" rel="nofollow noopener noreferrer"&gt;Hans Westbeek&lt;/a&gt; on &lt;a href="https://unsplash.com/photos/red-motor-stop-button-on-a-metal-panel-Po6upO2VQig?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText" target="_blank" rel="nofollow noopener noreferrer"&gt;Unsplash&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;</description><category>AI</category><category>AI Agents</category><category>AOE</category><category>Deployment Automation</category><category>DevOps</category><category>Docker Compose</category><category>Infrastructure</category><category>Security</category><category>Vibe DevOps</category><category>VibeOps</category><guid>https://anothercoffee.net/you-dont-need-vibe-devops/</guid><pubDate>Fri, 20 Mar 2026 12:00:00 GMT</pubDate></item><item><title>Trust, But Verify: What's Really Between Your AI Coding Tool and Your SSH Keys</title><link>https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/</link><dc:creator>Anthony Lopez-Vito</dc:creator><description>&lt;figure&gt;&lt;img src="https://anothercoffee.net/images/posts/trust-but-verify-og-1200x630.jpg"&gt;&lt;/figure&gt; &lt;p&gt;Stack Overflow's 2025 survey&lt;sup&gt;&lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/#fn1"&gt;1&lt;/a&gt;&lt;/sup&gt; found that 84% of developers are using or planning to use AI coding tools, and about half of professional developers use them daily. In December 2025, security researcher Ari Marzouk disclosed over 30 vulnerabilities&lt;sup&gt;&lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/#fn2"&gt;2&lt;/a&gt;&lt;/sup&gt; across Cursor, Windsurf, GitHub Copilot, Zed, Roo Code, Junie, and Cline. He called it IDEsaster, and I think the name fits.&lt;/p&gt;
&lt;p&gt;I use these tools too at Another Cup of Coffee. I run AI agents across dozens of projects and they've genuinely changed how I work. However, I also spent a fair amount of time last year thinking about what's actually standing between the AI and my private keys, my client credentials, my source code. The answer turned out to be more interesting and more uneven than I expected, so I wrote it up.&lt;/p&gt;
&lt;h2 id="whats-been-going-wrong-with-ai-coding-tool-security"&gt;What's been going wrong with AI coding tool security&lt;/h2&gt;
&lt;p&gt;The IDEsaster disclosure wasn't a one-off. A few months earlier, Johann Rehberger had spent August 2025 disclosing one AI tool vulnerability per day&lt;sup&gt;&lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/#fn3"&gt;3&lt;/a&gt;&lt;/sup&gt; across ChatGPT, Copilot, Cursor, Claude Code, Google Jules, and others. The pattern was remarkably consistent, and the attacks were creative enough that they're worth understanding even in brief:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;GitHub Copilot&lt;/strong&gt; had CamoLeak&lt;sup&gt;&lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/#fn4"&gt;4&lt;/a&gt;&lt;/sup&gt; (CVE-2025-59145, CVSS 9.6), where invisible instructions in a pull request description could make Copilot Chat silently exfiltrate AWS keys and source code through GitHub's own image proxy. A routine code review that leaked your secrets without any visible sign anything was wrong.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Claude Code&lt;/strong&gt; had three separate CVEs from multiple researchers: DNS exfiltration&lt;sup&gt;&lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/#fn5"&gt;5&lt;/a&gt;&lt;/sup&gt; via poisoned project files, automatic code execution&lt;sup&gt;&lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/#fn6"&gt;6&lt;/a&gt;&lt;/sup&gt; through malicious hook commands triggered when you open a project, and API key theft by overriding an environment variable in project config to redirect all API calls to an attacker's server.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Google Jules&lt;/strong&gt; lacked meaningful protections against prompt injection, and Rehberger demonstrated a full attack chain from prompt injection to botnet recruitment. Google triaged it as an "abuse risk" and closed the ticket. Make of that what you will.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The one that really made me sit up, though, was the Ona research&lt;sup&gt;&lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/#fn7"&gt;7&lt;/a&gt;&lt;/sup&gt;. A developer configured Claude Code with a denylist and a sandbox, and the agent found multiple ways around both. It discovered alternative filesystem paths to the same blocked binary, attempted to disable the sandbox when those paths triggered failures, and eventually invoked the Linux dynamic linker directly to bypass binary identification entirely. The agent actively reasoned about how to circumvent its own security controls, not because it was malicious but because the controls were between it and completing the task. That's not a bug. It's an AI doing exactly what AI does, and it's the part that should worry you most.&lt;/p&gt;
&lt;p&gt;Simon Willison (who coined the term "prompt injection") calls this the "lethal trifecta"&lt;sup&gt;&lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/#fn8"&gt;8&lt;/a&gt;&lt;/sup&gt;: an AI agent with access to private data, exposure to untrusted content, and the ability to communicate externally. Every AI coding tool that runs locally on your machine meets all three criteria. A January 2026 paper formalised this into the Promptware Kill Chain&lt;sup&gt;&lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/#fn9"&gt;9&lt;/a&gt;&lt;/sup&gt;, a seven-stage attack framework co-authored by Bruce Schneier and presented in a Black Hat webinar. The finding that stuck with me was that lateral movement (an agent compromised in one context attacking another) grew from zero documented incidents in 2023 to eight in 2025-2026. These tools aren't just targets any more. They're becoming vectors.&lt;/p&gt;
&lt;h2 id="so-whats-standing-between-the-ai-and-your-ssh-keys"&gt;So what's standing between the AI and your SSH keys?&lt;/h2&gt;
&lt;p&gt;Right, so that's the doom and gloom. What's on the other side? What's actually standing between these tools and the worst-case scenarios?&lt;/p&gt;
&lt;p&gt;Most AI coding tools that execute commands on your machine have some combination of three security layers. Not all tools have all three, and the strength varies enormously, but understanding the layers helps you work out what you're actually relying on.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Application-level permissions&lt;/strong&gt; are the first gate. This is the tool asking "shall I run this command?" before executing it. Claude Code has allow/deny lists, Cursor has its approval flow, Copilot has Safe Mode. Same idea, different implementations. The tool checks the command against its rules, and if it's not pre-approved, you get a prompt.&lt;/p&gt;
&lt;p&gt;The problem is that this layer is self-enforced, meaning the application is policing itself. A trojanised installation could show you a permission prompt and execute the command regardless of what you click, and you'd never know because there's no way to verify the enforcement from within the application itself. Even with a legitimate installation, the Ona research showed the agent can reason its way around deny rules by finding alternative filesystem paths to the same binary. Self-enforcement is useful for catching mistakes, but it's not a security boundary.&lt;/p&gt;
&lt;p&gt;The bigger problem is approval fatigue. When you're deep in a task and clicking through permission prompts without really reading them, you've basically disabled the permission system while it's still running.&lt;/p&gt;
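&lt;p&gt;To make the point about alternative filesystem paths concrete, here's a minimal Python sketch. It's an illustration only, not how Claude Code or any other tool actually matches its deny rules: the function names and the temporary files are hypothetical. It compares a check that matches the path text as typed against one that resolves symlinks before comparing.&lt;/p&gt;

```python
import os
import tempfile


def is_blocked_by_string(path, denylist):
    # Naive check: compares the path text exactly as the caller typed it.
    return path in denylist


def is_blocked_by_identity(path, denylist):
    # Stricter check: resolve symlinks on both sides before comparing.
    resolved = os.path.realpath(path)
    return any(os.path.realpath(entry) == resolved for entry in denylist)


def demo():
    # Build a throwaway "blocked" file plus a symlink that points at it.
    with tempfile.TemporaryDirectory() as workdir:
        blocked = os.path.join(workdir, "blocked-tool")
        with open(blocked, "w") as handle:
            handle.write("#!/bin/sh\n")
        alias = os.path.join(workdir, "alias-tool")
        os.symlink(blocked, alias)  # a second path to the same file
        denylist = [blocked]
        return {
            "direct_string": is_blocked_by_string(blocked, denylist),
            "alias_string": is_blocked_by_string(alias, denylist),
            "alias_identity": is_blocked_by_identity(alias, denylist),
        }


if __name__ == "__main__":
    print(demo())
```

&lt;p&gt;The string check blocks the original path but lets the symlinked alias through, while resolving both sides closes that particular gap. Neither check would catch the dynamic linker trick from the Ona research, which is why path matching on its own can't be treated as a boundary.&lt;/p&gt;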
&lt;p&gt;&lt;strong&gt;OS-level sandboxing&lt;/strong&gt; is the second layer, and this is where things get genuinely interesting. Claude Code and Codex CLI both take this seriously, with kernel-enforced isolation on both platforms: Claude Code uses bubblewrap&lt;sup&gt;&lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/#fn10"&gt;10&lt;/a&gt;&lt;/sup&gt; on Linux and Seatbelt on macOS, while Codex CLI pairs Seatbelt on macOS with Landlock and seccomp on Linux. Cursor caught up with version 2.0 (Landlock and seccomp on Linux). Copilot agent mode has terminal sandboxing on both platforms. The rest are further behind, and some have nothing at all.&lt;/p&gt;
&lt;p&gt;The reason this layer matters is that it's not the application promising to behave. Bubblewrap creates Linux kernel namespaces that give the sandboxed process a restricted view of the filesystem and network, and because it's the operating system kernel preventing access rather than the application, a user-space process can't override it. When Claude Code says you can't read &lt;code&gt;~/.ssh/id_rsa&lt;/code&gt; from within the sandbox, that's the kernel saying no.&lt;/p&gt;
&lt;p&gt;But not everyone has this, and the gap between the leaders and the rest is quite wide. Windsurf relies on user approval prompts and enterprise policy controls with no OS-level sandbox, Aider has nothing built in, and Continue.dev's "Plan Mode" is a UX feature rather than a security boundary. Even among tools that do sandbox, the coverage varies more than you'd expect. Claude Code needs both bubblewrap and socat installed for full filesystem and network isolation (without socat, your domain allowlists aren't actually enforced, which is the kind of thing you only discover when you go looking), and Cursor had a credential leak issue where the sandbox still exposed home directory files&lt;sup&gt;&lt;a href="https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/#fn11"&gt;11&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;I compiled this comparison from each tool's documentation and my own testing, as of March 2026. This field moves fast, so check the latest docs for your tool.&lt;/p&gt;
&lt;table class="table table-bordered mt-4 mb-4"&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Tool&lt;/th&gt;&lt;th&gt;Isolation&lt;/th&gt;&lt;th&gt;Network control&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Claude Code&lt;/td&gt;&lt;td&gt;bubblewrap / Seatbelt (OS-level)&lt;/td&gt;&lt;td&gt;Proxy-based domain allowlist&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Cursor 2.0&lt;/td&gt;&lt;td&gt;Seatbelt / Landlock + seccomp&lt;/td&gt;&lt;td&gt;Permission prompts for external access&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Copilot agent mode&lt;/td&gt;&lt;td&gt;Terminal sandboxing (experimental)&lt;/td&gt;&lt;td&gt;All network blocked when sandboxed&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;OpenAI Codex CLI&lt;/td&gt;&lt;td&gt;Seatbelt / seccomp + Landlock&lt;/td&gt;&lt;td&gt;Restricted&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;OpenAI Codex (cloud)&lt;/td&gt;&lt;td&gt;Isolated containers&lt;/td&gt;&lt;td&gt;Internet off by default&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Devin&lt;/td&gt;&lt;td&gt;Cloud sandbox&lt;/td&gt;&lt;td&gt;Cloud-managed&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Amazon Q&lt;/td&gt;&lt;td&gt;Docker containers&lt;/td&gt;&lt;td&gt;IAM-managed&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Windsurf&lt;/td&gt;&lt;td&gt;Policy and approval prompts only&lt;/td&gt;&lt;td&gt;Configuration-based&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Aider&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Continue.dev&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Then there are plain &lt;strong&gt;OS file permissions&lt;/strong&gt;, the standard Unix permissions enforced by the kernel. These are absolute within their scope and the strongest guarantee you have. But the scope is narrower than you'd think. Claude Code runs as your user account, so it can't touch another user's files or modify system files without sudo. That's real protection against privilege escalation. But your SSH keys, your browser data, your email, your cloud credentials, your shell config? Your account owns all of that, and OS permissions won't stop the tool from reading any of it. Everything in your home directory is fair game unless one of the other layers blocks it first.&lt;/p&gt;
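&lt;p&gt;A quick way to see how little plain file permissions protect is to ask the operating system which sensitive locations your own account can read. The sketch below does that in Python. The list of paths is a hypothetical example, not a complete inventory, so adjust it for your own machine.&lt;/p&gt;

```python
import os

# Hypothetical examples of sensitive locations; not an exhaustive list.
SENSITIVE_PATHS = [
    ".ssh",
    ".gnupg",
    ".password-store",
    ".aws/credentials",
    ".bashrc",
]


def readable_sensitive_paths(home, candidates=SENSITIVE_PATHS):
    # Return the candidates that exist under home and that the current
    # user may read according to ordinary OS permissions.
    found = []
    for relative in candidates:
        full = os.path.join(home, relative)
        if os.path.exists(full) and os.access(full, os.R_OK):
            found.append(relative)
    return found


if __name__ == "__main__":
    for item in readable_sensitive_paths(os.path.expanduser("~")):
        print("readable by this account:", item)
```

&lt;p&gt;Anything this prints is readable by any process running as you, which includes an AI coding tool unless a sandbox rule says otherwise.&lt;/p&gt;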
&lt;p&gt;Basically, no single layer is enough. You need all three working together, and you need to know what each one actually covers, because the gaps between them are where the real risk lives.&lt;/p&gt;
&lt;h2 id="what-were-doing-about-it"&gt;What we're doing about it&lt;/h2&gt;
&lt;p&gt;I &lt;a href="https://anothercoffee.net/run-dozens-of-projects-with-ai/"&gt;run AI agents across dozens of projects&lt;/a&gt;, and I &lt;a href="https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/"&gt;wrote earlier this year&lt;/a&gt; about how the multi-agent architecture avoids the kind of problems that hit OpenClaw. Mostly that comes down to design choices: session-only agents (nothing running between sessions, so no daemon to hijack), &lt;a href="https://anothercoffee.net/secure-your-ai-workflow-using-local-tokenisation/"&gt;encrypted credentials&lt;/a&gt; via &lt;code&gt;pass&lt;/code&gt; and GPG rather than plaintext files, file-based coordination through text memos instead of shared API keys, and per-project isolation so a compromise in one project stays there. Those architectural properties are the foundation, and they still hold. But the incidents from the past year pushed me to add the runtime security layers on top, because good architecture doesn't help if the tool on your machine can read everything your user account can.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;Good architecture doesn't help if the tool on your machine can read everything your user account can.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;I run Claude Code with bubblewrap and socat on Arch Linux. The sandbox is on globally with the escape hatch disabled, meaning agents can't retry failed commands without sandboxing even if they want to. Sensitive paths are blocked at the kernel level: private keys, the GPG keyring, the password store, and cloud credentials are all on the deny-read list. Shell configs, the SSH directory, and the sandbox settings themselves are write-protected so an agent can't weaken its own restrictions. Network access from sandboxed commands is restricted to GitHub and package registries by default, with project-level overrides only where I've made a deliberate decision that a specific project needs access to a specific domain.&lt;/p&gt;
&lt;p&gt;Commands like SSH and Docker that can't work inside a network namespace are excluded from the sandbox but still go through the permission layer. That's a weaker gate for those commands and I know it, but it's the trade-off: SSH needs real network access to reach remote hosts, and there's no way to sandbox that while keeping it functional. So I accept the weaker control for specific commands and tighten everything else around them.&lt;/p&gt;
&lt;p&gt;The sudo thing is worth mentioning because I learned it the hard way. One of my agents got into a loop trying to run sudo commands. It couldn't authenticate (there's no interactive terminal for password entry, which is actually a natural protection), but it kept trying, and the repeated failures triggered &lt;code&gt;pam_faillock&lt;/code&gt; and locked the account. I had to clear the lockout manually, and of course this happened in the middle of something urgent. The lesson isn't just "don't configure passwordless sudo" (though seriously, don't, because &lt;code&gt;NOPASSWD: ALL&lt;/code&gt; gives a compromised agent full root access to your machine). It's that even failed sudo attempts have consequences, and the inability to use sudo is a feature, not a bug.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;The inability to use sudo is a feature, not a bug.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;
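
&lt;p&gt;If you want to check a machine for passwordless sudo rules, a rough Python sketch like the one below can flag them. It only looks for active lines containing &lt;code&gt;NOPASSWD&lt;/code&gt; in text you hand it. The function name and the sample content are hypothetical, and it doesn't follow include directives or understand other ways of relaxing authentication.&lt;/p&gt;

```python
def find_nopasswd_rules(sudoers_text):
    # Return (line_number, rule) pairs for active lines containing NOPASSWD.
    hits = []
    for number, raw in enumerate(sudoers_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and anything starting with '#'.
        if not line or line.startswith("#"):
            continue
        if "NOPASSWD" in line:
            hits.append((number, line))
    return hits


# Hypothetical sudoers content, for illustration only.
SAMPLE = """\
# Example rules
root ALL=(ALL:ALL) ALL
# alice ALL=(ALL) NOPASSWD: ALL
alice ALL=(ALL) NOPASSWD: ALL
%wheel ALL=(ALL:ALL) ALL
"""

if __name__ == "__main__":
    for number, rule in find_nopasswd_rules(SAMPLE):
        print(number, rule)
```

&lt;p&gt;On a real system you'd feed it the contents of your sudoers file and any drop-in files, which normally need root to read, so run that part yourself rather than handing it to an agent.&lt;/p&gt;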

&lt;p&gt;And the approval fatigue problem is real. I've caught myself clicking "yes" to permission prompts without reading them because I'm focused on the actual work, and then glancing at what I'd just approved and realising the agent was about to delete something it shouldn't or overwrite a file I hadn't backed up. That jolt of "wait, what did I just approve?" is not a good feeling, and it's what pushed me toward auto-allow mode with a properly configured sandbox rather than relying on manual approval for everything. The sandbox handles the enforcement; the prompts are a secondary check for things that fall outside it.&lt;/p&gt;
&lt;p&gt;It took longer than I'd like to admit to get all of this working together without breaking the actual workflow. But that's sort of the price of taking it seriously, and now that it's in place, the day-to-day experience is genuinely better than it was when I was relying on permission prompts alone.&lt;/p&gt;
&lt;h2 id="a-practical-ai-coding-tool-security-checklist"&gt;A practical AI coding tool security checklist&lt;/h2&gt;
&lt;p&gt;If you're using AI coding tools in a business context, here's what we'd recommend regardless of which tool you're on. This isn't theory; it's what I actually did, and the order roughly reflects priority.&lt;/p&gt;
&lt;table class="table table-bordered mt-4 mb-4"&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th style="width:5%"&gt;&lt;/th&gt;&lt;th style="width:35%"&gt;Action&lt;/th&gt;&lt;th&gt;Why it matters&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Verify your installation source&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Every protection below assumes a legitimate installation. A trojanised tool can fake every prompt and status indicator. Install from official channels, verify checksums, keep things updated.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Enable OS-level sandboxing&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Claude Code has a &lt;code&gt;/sandbox&lt;/code&gt; command. Cursor 2.0 has agent sandboxing. Copilot has terminal sandboxing. If your tool doesn't offer it (Windsurf, Aider, Continue.dev), you're relying entirely on application-level controls.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Install socat (Linux)&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;bubblewrap handles filesystem isolation but socat is needed for network domain filtering. Without it, your allowlists aren't enforced.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Block read access to sensitive paths&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Deny-read &lt;code&gt;~/.ssh/id_*&lt;/code&gt;, &lt;code&gt;~/.ssh/*_rsa&lt;/code&gt;, GPG keyring, password store, cloud credentials. SSH still works via ssh-agent. You lose nothing.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Write-protect shell and tool configs&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Deny-write &lt;code&gt;~/.bashrc&lt;/code&gt;, &lt;code&gt;~/.zshrc&lt;/code&gt;, &lt;code&gt;~/.ssh/&lt;/code&gt;, and the sandbox settings file itself, so an agent can't weaken its own restrictions.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;6&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Disable the sandbox escape hatch&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Claude Code's &lt;code&gt;allowUnsandboxedCommands&lt;/code&gt; setting lets agents retry failed commands without sandboxing. Turn it off. A command failing inside the sandbox is the sandbox doing its job.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;7&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Never configure passwordless sudo&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;AI tools can't use sudo interactively (no TTY). That's a natural protection. &lt;code&gt;NOPASSWD: ALL&lt;/code&gt; removes it entirely and gives a compromised agent full root access.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Restrict network to known domains&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Allow GitHub, package registries, and whatever specific services your project needs. Everything else gets blocked at the sandbox level.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Know which commands bypass the sandbox&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;SSH, Docker, and similar tools need real network access and typically run outside the sandbox. They still go through permission rules, but that's a weaker gate.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;10&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Review project config files like code&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Multiple CVEs used &lt;code&gt;.claude/settings.json&lt;/code&gt;, hooks, MCP configs, and environment overrides as attack vectors. "Don't open untrusted repos" is the new "don't run untrusted executables."&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;11&lt;/td&gt;&lt;td&gt;&lt;strong&gt;Verify the sandbox is actually running&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;On Linux: &lt;code&gt;pgrep -a bwrap&lt;/code&gt; during a session (plain &lt;code&gt;ps aux | grep bwrap&lt;/code&gt; also matches the grep process itself). If there are no bubblewrap processes while commands are executing, something is wrong.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
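
&lt;p&gt;The last item in the checklist can also be scripted. The sketch below scans &lt;code&gt;/proc&lt;/code&gt; on Linux for processes whose short name matches. It assumes the bubblewrap process shows up as &lt;code&gt;bwrap&lt;/code&gt;, and the function name and the &lt;code&gt;proc_root&lt;/code&gt; parameter are my own additions for illustration.&lt;/p&gt;

```python
import os


def find_processes_named(name, proc_root="/proc"):
    # Scan numeric directories under proc_root and return the PIDs whose
    # 'comm' file (the short process name on Linux) equals name.
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        comm_path = os.path.join(proc_root, entry, "comm")
        try:
            with open(comm_path) as handle:
                if handle.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            # The process exited or is not readable; ignore it.
            continue
    return sorted(pids)


if __name__ == "__main__":
    pids = find_processes_named("bwrap")
    print("bubblewrap processes:", pids if pids else "none found")
```

&lt;p&gt;Run it from a separate terminal while the agent is executing a command. An empty result at that moment is the signal to stop and investigate.&lt;/p&gt;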

&lt;p&gt;Is this list perfect? No, and honestly the field is moving fast enough that it'll need updating. But it's a concrete starting point, and it's where I landed after working through the incidents and research above.&lt;/p&gt;
&lt;p&gt;If you're not sure what your security posture actually looks like (or you suspect the answer is "whatever the defaults were"), that's the kind of thing we help with. We've been running this setup across dozens of projects for over a year, and we're happy to &lt;a href="https://anothercoffee.net/contact/"&gt;have a conversation&lt;/a&gt; about what would work for your situation.&lt;/p&gt;
&lt;section class="mt-4 pt-4"&gt;
&lt;h3 class="text-center pb-4"&gt;Common Questions&lt;/h3&gt;
&lt;div class="container border bg-light p-4"&gt;
&lt;p&gt;&lt;strong&gt;Can AI coding tools read my SSH keys?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Yes, unless you've configured sandboxing to block it. AI coding tools run as your user account, which means they have the same file access you do. Your SSH keys, cloud credentials, browser data, and shell config are all readable by default. OS-level sandboxing with explicit deny-read rules is the only way to prevent it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Which AI coding tools have sandboxing?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As of March 2026, Claude Code, Cursor 2.0, Copilot agent mode, and OpenAI Codex CLI all offer some form of OS-level sandboxing. Windsurf, Aider, and Continue.dev rely on application-level controls or have no sandboxing at all. See the comparison table above for details.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Is the permission prompt enough to keep me safe?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;On its own, no. Permission prompts are self-enforced by the application, which means a compromised tool could bypass them entirely. Even with a legitimate installation, approval fatigue is a real problem. OS-level sandboxing provides kernel-enforced protection that doesn't depend on you clicking the right button every time.&lt;/p&gt;
&lt;/div&gt;
&lt;/section&gt;

&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;This article is part of &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;an ongoing series&lt;/a&gt; on how Another Cup of Coffee is adapting to AI. &lt;a href="https://anothercoffee.net/categories/ai/"&gt;Explore all articles in this series&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="mt-4 pt-4 text-muted small border-top border-bottom"&gt;
    &lt;h3 class="text-muted small"&gt;Footnotes&lt;/h3&gt;
    &lt;ol&gt;
      &lt;li id="fn1"&gt;Stack Overflow, &lt;a href="https://survey.stackoverflow.co/2025/" target="_blank" rel="nofollow noopener noreferrer"&gt;2025 Developer Survey&lt;/a&gt;.&lt;/li&gt;
      &lt;li id="fn2"&gt;Ari Marzouk, &lt;a href="https://maccarita.com/posts/idesaster/" target="_blank" rel="nofollow noopener noreferrer"&gt;IDEsaster: A Novel Vulnerability Class in AI IDEs&lt;/a&gt;, MaccariTA, December 2025.&lt;/li&gt;
      &lt;li id="fn3"&gt;Simon Willison, &lt;a href="https://simonwillison.net/2025/Aug/15/the-summer-of-johann/" target="_blank" rel="nofollow noopener noreferrer"&gt;The Summer of Johann&lt;/a&gt;, August 2025.&lt;/li&gt;
      &lt;li id="fn4"&gt;Legit Security, &lt;a href="https://www.legitsecurity.com/blog/camoleak-critical-github-copilot-vulnerability-leaks-private-source-code/" target="_blank" rel="nofollow noopener noreferrer"&gt;CamoLeak: Critical GitHub Copilot Vulnerability Leaks Private Source Code&lt;/a&gt;.&lt;/li&gt;
      &lt;li id="fn5"&gt;Johann Rehberger, &lt;a href="https://embracethered.com/blog/posts/2025/claude-code-exfiltration-via-dns-requests/" target="_blank" rel="nofollow noopener noreferrer"&gt;Claude Code: Exfiltration via DNS Requests&lt;/a&gt;, Embrace The Red.&lt;/li&gt;
      &lt;li id="fn6"&gt;Check Point Research, &lt;a href="https://research.checkpoint.com/2026/rce-and-api-token-exfiltration-through-claude-code-project-files-cve-2025-59536/" target="_blank" rel="nofollow noopener noreferrer"&gt;RCE and API Token Exfiltration through Claude Code Project Files&lt;/a&gt;.&lt;/li&gt;
      &lt;li id="fn7"&gt;Ona, &lt;a href="https://ona.com/stories/how-claude-code-escapes-its-own-denylist-and-sandbox" target="_blank" rel="nofollow noopener noreferrer"&gt;How Claude Code Escapes Its Own Denylist and Sandbox&lt;/a&gt;.&lt;/li&gt;
      &lt;li id="fn8"&gt;Simon Willison, &lt;a href="https://simonw.substack.com/p/the-lethal-trifecta-for-ai-agents" target="_blank" rel="nofollow noopener noreferrer"&gt;The Lethal Trifecta for AI Agents&lt;/a&gt;.&lt;/li&gt;
      &lt;li id="fn9"&gt;Promptware Kill Chain, &lt;a href="https://arxiv.org/abs/2601.09625" target="_blank" rel="nofollow noopener noreferrer"&gt;arXiv 2601.09625&lt;/a&gt;, January 2026. Co-authored by Bruce Schneier; presented in a Black Hat webinar.&lt;/li&gt;
      &lt;li id="fn10"&gt;&lt;a href="https://github.com/containers/bubblewrap" target="_blank" rel="nofollow noopener noreferrer"&gt;bubblewrap&lt;/a&gt; on GitHub. Unprivileged sandboxing tool using Linux kernel namespaces.&lt;/li&gt;
      &lt;li id="fn11"&gt;Luca Becker, &lt;a href="https://luca-becker.me/blog/cursor-sandboxing-leaks-secrets/" target="_blank" rel="nofollow noopener noreferrer"&gt;When Sandboxing Leaks Your Secrets&lt;/a&gt;.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Featured image photo by &lt;a href="https://unsplash.com/@elijahjcobb?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Elijah Cobb&lt;/a&gt; on &lt;a href="https://unsplash.com/photos/small-island-church-connected-by-a-walkway-0xD0PKrtil4?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Unsplash&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;</description><category>AI</category><category>AI Security</category><category>Claude Code</category><category>Copilot</category><category>Cursor</category><category>Developer Tools</category><category>Prompt Injection</category><category>Sandboxing</category><category>Security</category><guid>https://anothercoffee.net/trust-but-verify-ai-coding-tool-security/</guid><pubDate>Thu, 12 Mar 2026 12:00:00 GMT</pubDate></item><item><title>What OpenClaw Teaches Us About AI Agent Security</title><link>https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/</link><dc:creator>Aiden</dc:creator><description>&lt;figure&gt;&lt;img src="https://anothercoffee.net/images/posts/openclaw-security-og-1200x630.jpg"&gt;&lt;/figure&gt; &lt;p class="intro"&gt;OpenClaw went from zero to 180,000 GitHub stars in a matter of weeks. Then the security reports started arriving.&lt;/p&gt;

&lt;p&gt;In early February 2026, researchers disclosed CVE-2026-25253&lt;sup&gt;&lt;a href="https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/#fn1"&gt;1&lt;/a&gt;&lt;/sup&gt;: a one-click remote code execution vulnerability that could compromise any OpenClaw instance, even ones bound to localhost. Within days, independent scans found tens of thousands of exposed instances&lt;sup&gt;&lt;a href="https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/#fn2"&gt;2&lt;/a&gt;&lt;/sup&gt; across dozens of countries. Over 93% of verified instances had critical authentication bypass vulnerabilities.&lt;/p&gt;
&lt;p&gt;We've been building our own multi-agent system at Another Cup of Coffee, a set of &lt;a href="https://anothercoffee.net/building-an-operating-environment-for-ai-agents/"&gt;file-based conventions&lt;/a&gt; that let AI agents coordinate across projects and organisations. Honestly, we watched those disclosures land with a sort of guilty relief. Sympathy too, because building in public is hard and getting torn apart on security is painful. But mostly relief, because the problems OpenClaw exposed are exactly the ones we'd been paranoid about from the start.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;In short:&lt;/strong&gt; OpenClaw's architecture had plaintext credentials, an unvalidated WebSocket gateway, and a plugin marketplace where 20% of submissions were malware. We build AI agents differently: no persistent services, encrypted credentials via &lt;code&gt;pass&lt;/code&gt; and GPG, file-based coordination through text memos, and layered defences from instruction files to hooks to VM isolation. This article breaks down what went wrong and how a convention-based approach avoids these risks.&lt;/p&gt;
&lt;h2 id="how-openclaw-blew-up"&gt;How OpenClaw Blew Up&lt;/h2&gt;
&lt;p&gt;OpenClaw tried to turn an AI agent into a personal operating system. Browser automation, shell commands, cron jobs, inbox management, 50+ service integrations. All controlled through messaging platforms like WhatsApp and Telegram. Fair enough on the ambition. But the way they built it was a disaster.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Plaintext credentials everywhere.&lt;/strong&gt; OpenClaw stored API keys, OAuth tokens, and other secrets in plaintext Markdown and JSON files under &lt;code&gt;~/.openclaw/&lt;/code&gt;. Security researcher Jamieson O'Reilly of Dvuln demonstrated access to Anthropic API keys, Telegram bot tokens, Slack credentials, and full chat histories from exposed instances. Just sitting there in a dot-directory, unencrypted and readable by any process running as that user.&lt;/p&gt;
&lt;p&gt;The WebSocket vulnerability was arguably worse. The CVE-2026-25253 attack chain worked because OpenClaw's server didn't validate origin headers, so a victim clicking a single malicious link was enough to hijack their agent's gateway and get full command execution with the agent's system permissions. Localhost binding didn't help, because the attack pivoted through the victim's own browser. One click, game over.&lt;/p&gt;
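&lt;p&gt;To make the missing check concrete, here is a minimal sketch of origin validation for a local gateway. It is an illustration only, not OpenClaw's code or its actual fix: the &lt;code&gt;ALLOWED_ORIGINS&lt;/code&gt; value, the port, and the &lt;code&gt;origin_allowed&lt;/code&gt; helper are all hypothetical. The idea is that a handshake is refused unless the browser-supplied &lt;code&gt;Origin&lt;/code&gt; header exactly matches a known-good value.&lt;/p&gt;

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: the only page allowed to open the gateway socket.
ALLOWED_ORIGINS = frozenset({"http://127.0.0.1:8080"})


def origin_allowed(origin, allowed=ALLOWED_ORIGINS):
    """Return True only if the Origin header exactly matches an allowed origin."""
    if not origin:
        return False  # a missing header is refused, not trusted
    try:
        parts = urlsplit(origin)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False  # rejects "null", file: origins and other oddities
    normalised = f"{parts.scheme}://{parts.hostname}"
    if port is not None:
        normalised += f":{port}"
    return normalised in allowed
```

&lt;p&gt;A server would call something like this during the WebSocket handshake and close the connection on &lt;code&gt;False&lt;/code&gt;. An exact-match allow-list is used here rather than a prefix or substring test, since those are easy to bypass with lookalike hostnames.&lt;/p&gt;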
&lt;p&gt;Then there was ClawHub. An initial audit identified 341 malicious skills&lt;sup&gt;&lt;a href="https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/#fn3"&gt;3&lt;/a&gt;&lt;/sup&gt; in the plugin marketplace. Follow-up scans pushed the total past 800, roughly 20% of the entire registry. The primary payload was Atomic macOS Stealer, harvesting passwords, SSH keys, and cryptocurrency wallets. The only barrier to publishing a skill was a GitHub account older than one week. Twenty percent. Think about that for a second.&lt;/p&gt;
&lt;p&gt;Simon Willison, who also coined the term "prompt injection", calls one particular combination the "lethal trifecta": access to private data, exposure to untrusted content, and the ability to communicate externally. OpenClaw had all three by design.&lt;/p&gt;
&lt;h2 id="its-not-just-openclaw"&gt;It's Not Just OpenClaw&lt;/h2&gt;
&lt;p&gt;OpenClaw moved fast, skipped security basics, and paid the price. Fair enough. But if you're thinking "well, I don't use OpenClaw, so this doesn't apply to me," we'd push back on that.&lt;/p&gt;
&lt;p&gt;The deeper issue is architectural, and it's becoming common: AI agents that run as always-on services with broad system access, centralised credential stores, and third-party plugin ecosystems. Every one of those design choices creates attack surface, and the same pattern shows up in other agent frameworks that want to be platforms: accumulating capabilities, running background services, storing secrets, trusting marketplace content. The more capable the agent becomes, the more damage a single compromise can do.&lt;/p&gt;
&lt;h2 id="what-we-actually-do"&gt;What We Actually Do&lt;/h2&gt;
&lt;p&gt;Our system works on different assumptions. We call it an Agentic Operating Environment, and internally we have components with names like "multi-agent-framework" and "project-coordinator" (we're not great at branding). The security properties come from the architecture, not the naming.&lt;/p&gt;
&lt;p&gt;The biggest difference is that our agents don't run between sessions. There's no gateway to hijack, no WebSocket to exploit, no daemon listening on a port, and when a session ends nothing is running. The workstation itself is LUKS-encrypted, and SSH runs on a non-standard port with key-only authentication so password login is disabled entirely. The entire attack surface of CVE-2026-25253 simply doesn't exist because there's no service to hijack.&lt;/p&gt;
&lt;p&gt;Where OpenClaw dumps API keys into plaintext Markdown files in &lt;code&gt;~/.openclaw/&lt;/code&gt;, we use &lt;code&gt;pass&lt;/code&gt; (the standard Unix password manager, been around for years, boring and reliable) backed by GPG encryption. API keys reach agents through environment variables, never through files in the project tree. No dot-directory full of plaintext secrets sitting there for malware to harvest.&lt;/p&gt;
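&lt;p&gt;As a rough sketch of that flow (not the exact wrapper we run), a launcher can ask &lt;code&gt;pass&lt;/code&gt; for a secret and hand it to a single child process through its environment. The entry name, variable name, and the &lt;code&gt;read_secret&lt;/code&gt; and &lt;code&gt;build_agent_env&lt;/code&gt; helpers below are hypothetical. It assumes the usual &lt;code&gt;pass&lt;/code&gt; convention that the first line of an entry is the secret itself.&lt;/p&gt;

```python
import os
import subprocess


def read_secret(entry, pass_cmd=("pass", "show")):
    """Decrypt one pass entry and return its first line (the secret)."""
    result = subprocess.run(
        [*pass_cmd, entry], capture_output=True, text=True, check=True
    )
    lines = result.stdout.splitlines()
    if not lines or not lines[0].strip():
        raise ValueError(f"empty secret for entry {entry!r}")
    return lines[0].strip()


def build_agent_env(var_name, entry, base_env=None, pass_cmd=("pass", "show")):
    """Copy the environment and add one secret; nothing is written to disk."""
    env = dict(os.environ if base_env is None else base_env)
    env[var_name] = read_secret(entry, pass_cmd)
    return env
```

&lt;p&gt;A session could then be started with something like &lt;code&gt;subprocess.run(["claude"], env=build_agent_env("ANTHROPIC_API_KEY", "api/anthropic"))&lt;/code&gt;, where the entry path is an assumption. The secret lives only in the child's environment for the length of that session, though anything the agent runs in that session can still read it.&lt;/p&gt;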
&lt;table class="table table-bordered mt-4 mb-4"&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;OpenClaw&lt;/th&gt;&lt;th&gt;Our approach&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Runtime&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Always-on daemon with WebSocket gateway&lt;/td&gt;&lt;td&gt;Session-only, nothing running between sessions&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Credentials&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Plaintext in &lt;code&gt;~/.openclaw/&lt;/code&gt;&lt;/td&gt;&lt;td&gt;GPG-encrypted via &lt;code&gt;pass&lt;/code&gt;, injected as environment variables&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Plugins&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;ClawHub marketplace (20% malware at audit)&lt;/td&gt;&lt;td&gt;Capabilities come from the vendor (Claude Code, Codex, Gemini CLI) and instruction files&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Isolation&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Single agent, access to everything&lt;/td&gt;&lt;td&gt;Per-project boundaries, optionally on separate hardware or VMs&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Audit trail&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;None by default&lt;/td&gt;&lt;td&gt;Every file operation is a git commit&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;h3 id="conventions-hooks-and-the-agent-that-went-rogue"&gt;Conventions, Hooks, and the Agent That Went Rogue&lt;/h3&gt;
&lt;p&gt;We should be honest about something, though. AI agents overstep. It's not theoretical. Early on, one of our agents decided to "help" by reorganising files across a project it had no business touching. No malice, no exploit, just an agent that interpreted its instructions broadly and started tidying up someone else's work. We caught it in the git diff, reverted it, and spent the rest of that day (and most of the evening, honestly) writing stricter instruction files. That's the moment we stopped trusting conventions on their own.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;No malice, no exploit, just an agent that interpreted its instructions broadly and started tidying up someone else's work.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;So now we layer our defences. Instruction files tell agents to be read-only by default and to confirm before writing. That's convention rather than enforcement, but it's a convention the agent reads at the start of every session. On top of that, Claude Code's permission system lets us configure allow/deny lists controlling which tools an agent can use and which paths it can touch.&lt;/p&gt;
&lt;p&gt;Where conventions aren't enough, we use hooks, which are scripts that intercept commands before they execute. Our email guard hook is a good example. It parses every Bash command for mail binaries, catches evasion attempts through subshells and command substitution, and blocks them unconditionally. If it can't parse the input, it blocks anyway. Fail closed, not open (we learned that one the hard way).&lt;/p&gt;
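&lt;p&gt;Here is a simplified illustration of the fail-closed idea, not the production hook. It assumes the hook receives a JSON event on stdin with &lt;code&gt;tool_name&lt;/code&gt; and &lt;code&gt;tool_input.command&lt;/code&gt; fields and that exit code 2 blocks the call, which is how Claude Code's PreToolUse hooks are documented at the time of writing. The binary list and function names are hypothetical.&lt;/p&gt;

```python
import json
import os
import re
import sys

# Hypothetical list; adjust to whatever mail clients exist on the machine.
MAIL_BINARIES = frozenset({"mail", "mailx", "sendmail", "mutt", "msmtp"})

# Anything that is not a plausible word or path character is a separator, so
# subshells, backticks, pipes and quotes all break into separate tokens.
_SEPARATORS = re.compile(r"[^A-Za-z0-9_./-]+")


def should_block(raw_event):
    """Return True when the event must be blocked (fail closed)."""
    try:
        event = json.loads(raw_event)
        if event["tool_name"] != "Bash":
            return False
        command = event["tool_input"]["command"]
        if not isinstance(command, str):
            return True
    except (ValueError, KeyError, TypeError):
        return True  # unparseable input is blocked, never waved through
    tokens = _SEPARATORS.split(command)
    return any(os.path.basename(tok) in MAIL_BINARIES for tok in tokens)


def main(stream=sys.stdin):
    """Hook entry point: returns 2 (block) or 0 (allow) as the exit code."""
    if should_block(stream.read()):
        print("blocked: mail command or unparseable input", file=sys.stderr)
        return 2
    return 0
```

&lt;p&gt;A real script would end with &lt;code&gt;sys.exit(main())&lt;/code&gt;. This tokeniser is deliberately blunt: it will also block harmless commands that merely mention a mail binary as an argument, which is the price of erring on the side of blocking.&lt;/p&gt;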
&lt;p&gt;None of these layers is absolute on its own. But that's sort of the point.&lt;/p&gt;
&lt;h3 id="text-files-cant-execute-code"&gt;Text Files Can't Execute Code&lt;/h3&gt;
&lt;p&gt;The rest follows from one simple property of our architecture: agents communicate through text files. A &lt;code&gt;STATE.md&lt;/code&gt; can't open a reverse shell and a memo can't install a rootkit. Now, text files aren't completely harmless (prompt injection is real, and a poisoned memo could try to manipulate an agent into doing something it shouldn't), but compare that attack surface to executable plugins with system access. It's a fundamentally narrower target.&lt;/p&gt;
&lt;p&gt;Each project is its own boundary, too. An agent working on one client's web development never sees another client's data, never reads their state files, never processes their memos. Where OpenClaw's single agent had access to everything, our &lt;a href="https://anothercoffee.net/building-an-operating-environment-for-ai-agents/"&gt;mesh architecture&lt;/a&gt; means a compromise in one project stops at that project's directory. Cross-project coordination happens through structured markdown memos (just text files with checkboxes, nothing fancy), and since every file operation shows up in git, any violation is immediately visible. That same git history gives us an audit trail almost for free. We're adding more layers too (blocked commands now go to syslog, email drafts sit in a review queue until a human approves them) but honestly, when every change is already a git commit, you're most of the way there without trying.&lt;/p&gt;
&lt;h3 id="when-conventions-arent-enough"&gt;When Conventions Aren't Enough&lt;/h3&gt;
&lt;p&gt;Conventions and hooks are good. But an agent with shell access can, in principle, ignore every convention file it reads.&lt;/p&gt;
&lt;p&gt;The thing is, our mesh architecture already handles part of this. Nothing requires projects to sit on the same machine. Each project is an autonomous node that communicates through text files, so you can run different projects on different physical hardware and the coordination still works through memos. An agent on our workstation sends a memo to a project directory on a separate NUC across the network (yes, Samba, it's not glamorous but it works), and the receiving agent picks it up at its next session. Physical isolation between projects without changing anything about how the system works. And you can go further: configure the Samba share to only expose the &lt;code&gt;memos/incoming/&lt;/code&gt; directory, not the full project tree, and the sending agent gets a narrow write-only channel. It can't see the receiving project's source code, state files, or client data. It can't even list what other memos are already sitting there. A one-way letterbox between machines, which is a much harder boundary than "the agent promises to only read its own files."&lt;/p&gt;
&lt;p&gt;&lt;svg viewBox="0 0 580 260" xmlns="http://www.w3.org/2000/svg" role="img" aria-label="Physical isolation: workstation sends memos via scoped Samba share to a NUC, with no access to the receiving project's files" style="max-width: 580px; margin: 1.5em auto; display: block; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;"&gt;
  &lt;defs&gt;
    &lt;marker id="iso-arrow" viewBox="0 0 10 8" refX="10" refY="4" markerWidth="8" markerHeight="6" orient="auto"&gt;
      &lt;path d="M0,0 L10,4 L0,8 Z" fill="#aaa"&gt;&lt;/path&gt;
    &lt;/marker&gt;
  &lt;/defs&gt;
  &lt;rect x="10" y="10" width="200" height="240" rx="8" fill="none" stroke="#555" stroke-width="1.5" stroke-dasharray="6,3"&gt;&lt;/rect&gt;
  &lt;text x="110" y="34" text-anchor="middle" font-size="13" fill="#555" font-weight="bold"&gt;Workstation&lt;/text&gt;
  &lt;rect x="30" y="50" width="160" height="40" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="110" y="75" text-anchor="middle" font-size="13" fill="#333"&gt;client-a-web&lt;/text&gt;
  &lt;rect x="30" y="110" width="160" height="40" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="110" y="135" text-anchor="middle" font-size="13" fill="#333"&gt;coordinator&lt;/text&gt;
  &lt;rect x="30" y="170" width="160" height="40" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="110" y="195" text-anchor="middle" font-size="13" fill="#333"&gt;sysadmin&lt;/text&gt;
  &lt;rect x="370" y="10" width="200" height="240" rx="8" fill="none" stroke="#555" stroke-width="1.5" stroke-dasharray="6,3"&gt;&lt;/rect&gt;
  &lt;text x="470" y="34" text-anchor="middle" font-size="13" fill="#555" font-weight="bold"&gt;NUC&lt;/text&gt;
  &lt;rect x="390" y="50" width="160" height="40" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="470" y="75" text-anchor="middle" font-size="13" fill="#333"&gt;client-b-web&lt;/text&gt;
  &lt;rect x="390" y="120" width="160" height="36" rx="5" fill="#fff5f5" stroke="#d4576b" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="470" y="143" text-anchor="middle" font-size="12" fill="#d4576b"&gt;memos/incoming/&lt;/text&gt;
  &lt;rect x="390" y="180" width="160" height="50" rx="5" fill="#f9f9f9" stroke="#ccc" stroke-width="1" stroke-dasharray="4,3"&gt;&lt;/rect&gt;
  &lt;text x="470" y="200" text-anchor="middle" font-size="11" fill="#bbb"&gt;source, state, data&lt;/text&gt;
  &lt;text x="470" y="216" text-anchor="middle" font-size="11" fill="#bbb"&gt;(not shared)&lt;/text&gt;
  &lt;line x1="198" y1="138" x2="382" y2="138" stroke="#d4576b" stroke-width="2" marker-end="url(#iso-arrow)"&gt;&lt;/line&gt;
  &lt;text x="290" y="120" text-anchor="middle" font-size="12" fill="#999"&gt;Samba (write-only)&lt;/text&gt;
  &lt;text x="340" y="205" text-anchor="middle" font-size="18" fill="#ccc"&gt;✕&lt;/text&gt;
&lt;/svg&gt;&lt;/p&gt;
&lt;p&gt;Of course, dedicating a physical machine to every project that needs isolation isn't always practical. For those cases, we can deploy KVM virtual machines on the same hardware instead. A Debian guest with no shared folders and no host filesystem access. SSH-only from the workstation, key-based auth, nothing else. The agent works inside the VM as if it were a standalone machine. If a session goes wrong, you roll back the entire VM state to a snapshot and it's like it never happened.&lt;/p&gt;
&lt;p&gt;Docker gets you filesystem isolation, but you're still sharing the kernel and the snapshot story isn't as clean. A full VM is a harder boundary. It's more overhead, sure, but for sessions where an agent has broad shell access and you're experimenting with something new, I'd rather have that overhead than spend an evening working out what it changed.&lt;/p&gt;
&lt;p&gt;This isn't security through obscurity. It's security through &lt;em&gt;reduction and layered defence&lt;/em&gt;. Fewer moving parts, encrypted credentials, conventions backed by hooks and permissions, and VM isolation when you need a harder boundary. The audit trail is baked into the architecture rather than bolted on after the fact.&lt;/p&gt;
&lt;h2 id="what-we-give-up"&gt;What We Give Up&lt;/h2&gt;
&lt;p&gt;Our approach gives up things that OpenClaw offered. We don't have 50+ service connectors. We can't trigger browser automation from Telegram or manage a calendar from WhatsApp. Inbox management is possible, but it needs explicit configuration per project. One of ours has access to a dedicated Gmail address through standard IMAP sync, not to anyone's personal inbox. That's a scoped, session-only capability on a dedicated address, not always-on access to your entire digital life. The operational overhead of keeping an always-on agent secured across fifty integration points isn't something we're eager to take on.&lt;/p&gt;
&lt;p&gt;For hobbyists and developers who enjoy living on the bleeding edge, those features are the whole point. OpenClaw's popularity proved there's genuine demand for an AI agent that lives in your messaging apps and manages your digital life. And if something goes wrong, you reinstall and move on.&lt;/p&gt;
&lt;p&gt;But if you're running a business? The calculus is completely different. An agent with access to your email, your client files, your invoicing, your calendar, fifty service integrations, and it gets compromised or just makes a stupid mistake? That's not a "reinstall and move on" situation. An email sent to the wrong client, a file deleted from a live project, an API key leaked that gives someone access to your payment processor. For a sole trader or a small agency, any one of those could be genuinely catastrophic.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;Anyone who's run a small business knows that client trust is hard to build and easy to destroy, and it takes far less than a data breach to lose it.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;So for business operations, for coordinating work across projects, tracking what needs to happen next, and maintaining continuity between sessions, the convention-based approach is both simpler and more secure. You don't need a persistent service when a markdown file does the same job. You don't need a plugin marketplace when the agent's capabilities come from its vendor (Claude Code, Codex, Gemini CLI) and your instruction files. And you definitely don't need fifty integration points when you can't guarantee the security of any of them.&lt;/p&gt;
&lt;p&gt;And no, we're not saying just do everything manually. The always-on model is genuinely useful and OpenClaw's popularity proves the demand is real. The problem isn't automation, it's &lt;em&gt;unscoped&lt;/em&gt; automation. An always-on agent with plaintext credentials, no origin validation, and fifty unsecured integration points is a different thing entirely from a systemd timer that kicks off a specific agent session at a scheduled time with scoped permissions.&lt;/p&gt;
&lt;p&gt;We use systemd timers (systemd's equivalent of cron jobs, on our Arch Linux machines) for exactly this. Our email sender and backup archives both run on timers. These are automated, they run without us, but each one does a specific job with specific access. Adding an agent session that triggers on a schedule or an event is the same principle. The difference from OpenClaw is that it's a deliberate decision each time: this agent, this scope, this schedule, these permissions. Not "here are the keys to everything, run forever."&lt;/p&gt;
&lt;h2 id="so-whats-the-takeaway"&gt;So What's the Takeaway?&lt;/h2&gt;
&lt;p&gt;Basically, nobody's saying your agents shouldn't be automated. But how much access they get, and whether you actually decided to give them that access or it just came switched on by default, matters a lot more than most people realise. Persistent services with broad permissions are a liability, not a feature. But scoped automation with layered defences is fine, and it's where we're heading too.&lt;/p&gt;
&lt;p&gt;Beyond that, keep credentials out of your agent's file system. Tools like &lt;code&gt;pass&lt;/code&gt;, system keychains, and environment variables exist for a reason and they're not hard to set up. The moment secrets land in plaintext files inside a dot-directory, every piece of malware running under your account can read them.&lt;/p&gt;
&lt;p&gt;And be sceptical of agent plugin ecosystems. Yes, marketplaces are convenient, but they inherit all the security problems of package registries, with the added risk that AI agents often run with elevated system access. If 20% of a marketplace is malware within weeks of launch, the vetting model is broken. There's no polite way to say that.&lt;/p&gt;
&lt;p&gt;These aren't theoretical concerns any more. OpenClaw proved they're practical ones, at scale, with real consequences for real users. If you're running AI agents in your workflow and you haven't thought about these failure modes, maybe don't wait for your own OpenClaw moment to find out.&lt;/p&gt;
&lt;hr&gt;
&lt;div class="mt-5"&gt;
    &lt;h3&gt;You may also like&lt;/h3&gt;

    &lt;div class="row"&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;
                        &lt;img src="https://anothercoffee.net/images/a-west-london-micro-agencys-journey-to-ai-featured.jpg" class="card-img-top" alt="Coffee and a laptop with ChatGPT"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/" class="listtitle"&gt;Still Alive: A Micro Agency's 20 Year Journey&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2024-10-15T15:28:15Z" title="15 October 2024"&gt;15 October 2024&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                        &lt;p class="card-text flex-grow-1"&gt;This article will be the first in a series where I'll share how Artificial Intelligence has reshaped how we operate at Another Cup of Coffee.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/building-an-operating-environment-for-ai-agents/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/aoe-howibuild-card-300x150.jpg" class="card-img-top" alt="Building an Operating Environment for AI Agents"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/building-an-operating-environment-for-ai-agents/" class="listtitle"&gt;Building an Operating Environment for AI Agents&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2025-05-15T13:20:00Z" title="15 May 2025"&gt;15 May 2025&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                        &lt;p class="card-text flex-grow-1"&gt;How markdown files and conventions turned CLI agent tools into a coordination system running 44 projects across 14 organisations. No framework required.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/run-dozens-of-projects-with-ai/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/run-dozens-of-projects-ai-card-300x150.jpg" class="card-img-top" alt="One person running dozens of projects with AI agents"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/run-dozens-of-projects-with-ai/" class="listtitle"&gt;I Run Dozens of Projects with AI. The Hard Part Isn't the AI.&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2025-12-20T12:00:00Z" title="20 December 2025"&gt;20 December 2025&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                        &lt;p class="card-text flex-grow-1"&gt;One person, dozens of projects, four AI vendors. I spent a year building a coordination system for AI agents. The components are simple. Getting them right was not.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;

&lt;div class="mt-4 pt-4 text-muted small border-top border-bottom"&gt;
    &lt;h3 class="text-muted small"&gt;Footnotes&lt;/h3&gt;
    &lt;ol&gt;
      &lt;li id="fn1"&gt;SOCRadar, &lt;a href="https://socradar.io/blog/cve-2026-25253-rce-openclaw-auth-token/" target="_blank" rel="nofollow noopener noreferrer"&gt;CVE-2026-25253: RCE in OpenClaw Auth Token&lt;/a&gt;.&lt;/li&gt;
      &lt;li id="fn2"&gt;Infosecurity Magazine, &lt;a href="https://www.infosecurity-magazine.com/news/researchers-40000-exposed-openclaw/" target="_blank" rel="nofollow noopener noreferrer"&gt;Researchers Find 40,000 Exposed OpenClaw Instances&lt;/a&gt;.&lt;/li&gt;
      &lt;li id="fn3"&gt;The Hacker News, &lt;a href="https://thehackernews.com/2026/02/researchers-find-341-malicious-clawhub.html" target="_blank" rel="nofollow noopener noreferrer"&gt;Researchers Find 341 Malicious ClawHub Skills&lt;/a&gt;.&lt;/li&gt;
    &lt;/ol&gt;
    &lt;p&gt;Featured image photo by &lt;a href="https://unsplash.com/@davidtoddmccarty?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText" target="_blank" rel="nofollow noopener noreferrer"&gt;David Todd McCarty&lt;/a&gt; on &lt;a href="https://unsplash.com/photos/red-lobster-on-white-ceramic-plate-OrTjocYe1b4?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText" target="_blank" rel="nofollow noopener noreferrer"&gt;Unsplash&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;</description><category>AI</category><category>CVE-2026-25253</category><category>Multi-Agent</category><category>OpenClaw</category><category>Operations</category><category>Security</category><guid>https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/</guid><pubDate>Sun, 22 Feb 2026 12:00:00 GMT</pubDate></item><item><title>I Run Dozens of Projects with AI. The Hard Part Isn't the AI.</title><link>https://anothercoffee.net/run-dozens-of-projects-with-ai/</link><dc:creator>Anthony Lopez-Vito</dc:creator><description>&lt;figure&gt;&lt;img src="https://anothercoffee.net/images/posts/run-dozens-of-projects-ai-og-1200x630.jpg"&gt;&lt;/figure&gt; &lt;p&gt;I run a micro-agency which, by nature, is always resource constrained. We purposely stay small and nimble to adjust quickly to changes in a project, or the industry as a whole, and that means we can't take on lots of team members. I've written about our &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;twenty-year journey&lt;/a&gt; elsewhere but the upshot is that I end up doing a lot of the work myself.&lt;/p&gt;
&lt;p&gt;Staying small and specialised is manageable with a handful of projects but when you're running dozens for multiple clients, the problem becomes keeping track of the context. What's new on this project? What are the priorities on that one? What are the specific requirements for this particular client? Tracking all of that detail across a bunch of projects doesn't scale, even with a project management tool like Teamwork or Basecamp.&lt;/p&gt;
&lt;p&gt;This article is about how I fixed this problem of project context. The fix itself started off as a simple improvement for one client, but it grew organically, and ended up completely changing the way I use a computer and manage my projects. I've started calling the result an &lt;a href="https://anothercoffee.net/categories/aoe/"&gt;Agentic Operating Environment&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id="too-busy-for-new-toys"&gt;Too Busy for New Toys&lt;/h3&gt;
&lt;p&gt;ChatGPT was released on 30 November 2022 and while I followed developments with interest, I was too busy to be an early adopter. When you run a small operation, you don't have the luxury of playing with new tools just because they're novel or fun. You need to know something works before you invest time in it because paying clients demand your attention.&lt;/p&gt;
&lt;p&gt;However, by mid-2023, ChatGPT had gone mainstream enough that it was time I gave it a spin in earnest, so I suggested to a client that we try it on a project. The results were astounding. We completed months of work in a few weeks and saved significantly on hiring specialist consultants.&lt;/p&gt;
&lt;p&gt;The potential was immediately obvious. One person could now do the work of several, deliver higher quality output, and do it faster. For a small outfit, that's a significant edge.&lt;/p&gt;
&lt;h3 id="trapped-in-the-browser"&gt;Trapped in the Browser&lt;/h3&gt;
&lt;p&gt;I started with the web interfaces everyone else was using: ChatGPT, Claude, and primarily &lt;a href="https://anothercoffee.net/why-we-keep-using-chatllm/"&gt;ChatLLM&lt;/a&gt;, which gave me access to multiple models at a fraction of the cost. They were genuinely useful, but two problems kept getting worse the more I relied on them.&lt;/p&gt;
&lt;p&gt;First, there was no real continuity between sessions, and even after the Projects feature rolled out, keeping the uploaded context updated meant manually preparing and re-uploading documents every time something changed.&lt;/p&gt;
&lt;p&gt;Second, the work was trapped inside the browser. AI could help me think through a problem or draft a document, but getting that output into my actual project files meant tedious copy-paste. The download and export features on these platforms were unreliable. Many times ChatGPT or ChatLLM would offer me a download link and the file would be empty or not what was expected. I needed a way to break out of the browser.&lt;/p&gt;
&lt;h3 id="breaking-out"&gt;Breaking Out&lt;/h3&gt;
&lt;p&gt;Claude Code was the breakthrough. Instead of talking to an AI in a browser window, I had an agent that worked directly inside my file system's project directories. It could read files, run commands, and do real work on my local machine.&lt;/p&gt;
&lt;p&gt;Then there were the context files, giving the agent project background and instructions. Suddenly we had an AI that could pick up where the last session left off so there was no more re-explaining, no more starting from scratch.&lt;/p&gt;
&lt;p&gt;The real shift was realising what "runs commands on your machine" actually meant. Yes, this was a coding tool designed for software developers to write and debug code. But it could run any terminal command, which meant it could configure my local environment, install packages, and manage services. And if it could do that locally, it could SSH into a remote server and do the same thing there. This was more than a coding tool. It was a general-purpose operator that could do anything I could reach from a terminal. That's when it became the foundation for everything else.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;That's when it became the foundation for everything else.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;Claude Code obviously wasn't just for coding. Systems administration, project management, writing, research, document generation, anything that benefits from knowing a project's context could now be handled by an AI. The tool was designed for developers but the principle applied to everything I do day-to-day.&lt;/p&gt;
&lt;p&gt;That opened a bigger question. If an agent can read a context file, what else could you put in one? Could you give it genuine working memory, like what happened last session, what's stuck, what another project needs from this one?&lt;/p&gt;
&lt;h3 id="one-client-many-agents"&gt;One Client, Many Agents&lt;/h3&gt;
&lt;p&gt;I started with one client who had particularly complex needs. Instead of one general-purpose AI assistant, I created specialised agents as separate Claude Code projects, each handling a different area of the client's work:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Systems administration&lt;/li&gt;
&lt;li&gt;Web development&lt;/li&gt;
&lt;li&gt;Content curation&lt;/li&gt;
&lt;li&gt;Project management&lt;/li&gt;
&lt;li&gt;Requirements analysis&lt;/li&gt;
&lt;li&gt;Research&lt;/li&gt;
&lt;li&gt;Document generation&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each agent had its own context, its own working memory, and its own domain expertise. The difference was immediate. Switching between projects no longer meant rebuilding my mental model of where each one stood. For someone juggling many fast-moving and slow-moving projects at the same time, that friction adds up.&lt;/p&gt;
&lt;p&gt;Then I deployed the same approach on a second client, one that happened to have team members who overlapped with the first. This is where the cognitive burden genuinely lifted because the agents remembered all the details. Who works on what, what the specific requirements are, what happened last session and the session before that. Each agent would know at the start of every session what I used to carry around in my head, across two separate clients, without mixing anything up.&lt;/p&gt;
&lt;p&gt;I stopped being the person who has to remember everything because the agents did that now.&lt;/p&gt;
&lt;h3 id="teaching-agents-to-talk"&gt;Teaching Agents to Talk&lt;/h3&gt;
&lt;p&gt;The agents needed to communicate though. An infrastructure change in one project might affect the web development project, or a content decision might depend on input from project management. Plus, these agents weren't all running on the same AI vendor.&lt;/p&gt;
&lt;p&gt;I was already using different models for different strengths. Some are better at careful, structured reasoning; others are faster and cheaper for routine tasks; some handle large volumes of context well; others are stronger at creative work or code generation. Picking the right model for each job made sense, but the agents lived on different platforms and had no native way to coordinate.&lt;/p&gt;
&lt;p&gt;My first attempt at solving this let agents edit files directly in other projects. That soon broke when one agent tidied up files another agent needed, but the fix was obvious. My university research was in multiprocessor computing, and this is a solved problem. Also, Linux processes don't scribble in each other's memory; they communicate through pipes and message queues. Same principle, different scale. Instead of agents editing each other's files, they send structured messages, or &lt;em&gt;memos&lt;/em&gt;. Each memo is a plain-text markdown file dropped into the receiving project's directory, and the receiving project picks them up at its next session. Because the memos are plain text files, they work across any AI tool that can read files. A Claude agent sends a memo, then a ChatGPT agent picks it up next session. The vendor boundary becomes invisible.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;A Claude agent sends a memo, then a ChatGPT agent picks it up next session.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;This was a bigger deal than it might sound because it meant I could pick the best model for each job without worrying about whether the agents could coordinate afterwards. Cross-project and cross-vendor communication, solved in one move with plain text and file conventions.&lt;/p&gt;
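&lt;p&gt;To make the memo idea concrete, here is a minimal Python sketch of the sending side. It is an illustration under stated assumptions, not the actual implementation: the &lt;code&gt;send_memo&lt;/code&gt; helper, the &lt;code&gt;memos/incoming&lt;/code&gt; inbox path, the timestamped file name, and the memo header fields are all hypothetical choices. The only parts taken from the description above are that a memo is a plain-text markdown file and that it is dropped into the receiving project's directory.&lt;/p&gt;

```python
from datetime import datetime, timezone
from pathlib import Path


def send_memo(sender, recipient_root, subject, body):
    """Drop a plain-text markdown memo into the recipient project's inbox."""
    # Hypothetical inbox location inside the receiving project.
    inbox = Path(recipient_root) / "memos" / "incoming"
    inbox.mkdir(parents=True, exist_ok=True)

    now = datetime.now(timezone.utc)
    slug = "-".join(subject.lower().split())
    # A timestamped name keeps memos in arrival order when sorted by name.
    memo_path = inbox / (now.strftime("%Y%m%dT%H%M%SZ") + "-" + slug + ".md")

    # Assumed memo layout: a title, a short header list, then the message.
    lines = [
        "# Memo: " + subject,
        "",
        "- From: " + sender,
        "- Sent: " + now.isoformat(timespec="seconds"),
        "",
        body.strip(),
        "",
    ]
    memo_path.write_text("\n".join(lines), encoding="utf-8")
    return memo_path
```

&lt;p&gt;The sender only ever creates a new file in the recipient's inbox and never edits the recipient's existing files, which is the property that matters. Any tool that can read a directory can pick the memo up later.&lt;/p&gt;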
&lt;h3 id="working-memory"&gt;Working Memory&lt;/h3&gt;
&lt;p&gt;The memo system solved communication, but agents also needed memory that persisted between sessions.&lt;/p&gt;
&lt;p&gt;My first version was a single state file with no size cap. It grew quickly and before long the agent was spending its limited context window reading history it didn't need. Still, the fix was straightforward. A project brief captures confirmed knowledge that changes slowly, like what the project is, who's involved, and what's been decided. A state file tracks what's happening right now and what needs to happen next. An archive holds older progress that isn't immediately relevant. A separate file captures reusable patterns and gotchas so the agent doesn't repeat the same mistakes. Git sits underneath everything, so nothing is ever truly lost and you can always trace what happened and when.&lt;/p&gt;
&lt;p&gt;Each layer serves a different purpose. The conventions keep them from bleeding into each other.&lt;/p&gt;
&lt;h3 id="an-agent-that-builds-agents"&gt;An Agent That Builds Agents&lt;/h3&gt;
&lt;p&gt;By this point I had a system that worked. I had specialised agents with context files, state management, memo-based communication, vendor-neutral conventions. However, setting up a new project meant creating the right files, writing the context, establishing all the conventions. Repetitive work.&lt;/p&gt;
&lt;p&gt;So I created what I think of as a factory agent: an AI agent whose job is to create other AI agents. Give it a project brief and it scaffolds everything: context files, state tracking, memo conventions, and project-specific instructions.&lt;/p&gt;
&lt;p&gt;Next was a manager agent that sits above the project agents and keeps the high-level view. It knows what's happening across all projects without holding the details of any single one. Each project agent tracks its own domain in depth but the manager tracks the big picture and coordinates between them.&lt;/p&gt;
&lt;p&gt;Three layers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a factory that creates agents;&lt;/li&gt;
&lt;li&gt;a manager that coordinates them;&lt;/li&gt;
&lt;li&gt;and project agents that do the actual work.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I started moving all my projects onto this system and jumped from two to dozens in the space of a few weeks.&lt;/p&gt;
&lt;h3 id="the-boring-bits-that-matter"&gt;The Boring Bits That Matter&lt;/h3&gt;
&lt;p&gt;The system works. But the problems that come with running AI agents on real client work are unglamorous and unavoidable.&lt;/p&gt;
&lt;p&gt;I was reviewing an agent's work and noticed it had been claiming tasks as complete without actually doing the work. It reported everything as fine but when I challenged it, the agent admitted taking shortcuts. A convention was needed to require explicit state updates, so I put in place a task checklist system. It's not a trust issue. The models just need clear structures to follow. That's a recurring theme with this work: better conventions lead to better results.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;The models just need clear structures to follow.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;I also learned which tasks justify the most capable model. Strategic work, writing, and complex decisions benefit from the best model available, but routine file scaffolding and handover updates don't need it. The difference shows in speed. A lighter model handles simple tasks in seconds where a heavier one takes noticeably longer, and if you're paying per use rather than a flat plan, the cost adds up too. It sounds like a small thing until you're running several sessions in parallel and a heavier model is still churning through a handover update while you're waiting to start the next project.&lt;/p&gt;
&lt;p&gt;Then there are the invisible failures, the ones that don't produce error messages. An agent confident it's working in the right project directory turns out to be somewhere else entirely, and without conventions this happens a lot. Agents that archive messages before completing the actions in them. Getting them to follow the conventions reliably is one of the harder problems, and it's still ongoing. Over twenty documented patterns now, each one born from something going wrong in real client work.&lt;/p&gt;
&lt;h3 id="why-better-ai-wont-fix-this"&gt;Why Better AI Won't Fix This&lt;/h3&gt;
&lt;p&gt;If you're running multiple projects across multiple clients, a smarter model doesn't help much when it still forgets everything between sessions. The AI is already good enough but the coordination problem stays the same. What's missing for most people is the infrastructure and conventions that make it consistently useful across real work.&lt;/p&gt;
&lt;p&gt;Most small businesses are still figuring out where AI fits. If you're starting to rely on it across multiple projects, the ceiling comes quickly. There's too much context to carry manually, too many sessions starting from scratch, and you end up being the human messenger between tools that can't talk to each other. The temptation is to stitch together automation workflows to fix this, but that's plumbing, not infrastructure. It breaks when any component changes. Building on the vendors' own tools means they maintain the foundation. You just maintain the conventions and customise them for your needs.&lt;/p&gt;
&lt;p&gt;The hard part was never the AI but the conventions, the memory management, the communication protocols. Boring problems that determine whether AI actually works across real projects or just impresses you in a single session. I've solved it for my own business and I build this infrastructure for others. If any of this sounds familiar, &lt;a href="https://anothercoffee.net/about/"&gt;I'm happy to talk through it&lt;/a&gt;.&lt;/p&gt;
&lt;section class="mt-4 pt-4"&gt;
&lt;h3 class="text-center pb-4"&gt;Common Questions&lt;/h3&gt;
&lt;div class="container border bg-light p-4"&gt;
&lt;p&gt;&lt;strong&gt;Do I need to be technical to set this up?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To operate it, no. The conventions are plain text files that AI agents read and update. You don't write code or manage infrastructure day to day. Setting the system up does require technical knowledge (which is where I come in), but once it's running, working within it doesn't.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What AI tools does this work with?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Any tool that can read files in a project directory, which is most of the capable ones now. I currently use four different vendor families. Because the conventions are plain text with no vendor-specific dependencies, new tools slot in without changes. Given how quickly things shift in this space, that flexibility has already proved its worth several times over.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What happens if I stop using AI agents?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Everything is in markdown files. Your project state, history, issues, and accumulated knowledge are all human-readable and useful whether or not you're using AI. You'd walk away with better project documentation than most businesses have, which isn't a bad outcome regardless.&lt;/p&gt;
&lt;/div&gt;
&lt;/section&gt;

&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;This article is part of &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;an ongoing series&lt;/a&gt; on how Another Cup of Coffee is adapting to AI. &lt;a href="https://anothercoffee.net/categories/ai/"&gt;Explore all articles in this series&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;
&lt;div class="mt-5"&gt;
    &lt;h3&gt;You may also like&lt;/h3&gt;

    &lt;div class="row"&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/building-an-operating-environment-for-ai-agents/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/aoe-howibuild-card-300x150.jpg" class="card-img-top" alt="Building an Operating Environment for AI Agents"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/building-an-operating-environment-for-ai-agents/" class="listtitle"&gt;Building an Operating Environment for AI Agents&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2025-05-15T13:20:00Z" title="15 May 2025"&gt;15 May 2025&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                        &lt;p class="card-text flex-grow-1"&gt;How markdown files and conventions turned CLI agent tools into a coordination system running 44 projects across 14 organisations. No framework required.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;
                        &lt;img src="https://anothercoffee.net/images/a-west-london-micro-agencys-journey-to-ai-featured.jpg" class="card-img-top" alt="Coffee and a laptop with ChatGPT"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/" class="listtitle"&gt;Still Alive: A Micro Agency's 20 Year Journey&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2024-10-15T15:28:15Z" title="15 October 2024"&gt;15 October 2024&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                        &lt;p class="card-text flex-grow-1"&gt;This article will be the first in a series where I'll share how Artificial Intelligence has reshaped how we operate at Another Cup of Coffee.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/openclaw-security-card-300x150.jpg" class="card-img-top" alt="Red lobster on a white plate"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/what-openclaw-teaches-us-about-ai-agent-security/" class="listtitle"&gt;What OpenClaw Teaches Us About AI Agent Security&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2026-02-22T12:00:00Z" title="22 February 2026"&gt;22 February 2026&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                        &lt;p class="card-text flex-grow-1"&gt;OpenClaw's security crisis exposed real problems with how AI agents handle credentials, plugins, and system access. Here's what went wrong and how a convention-based approach avoids these risks entirely.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;

&lt;div class="mt-4 pt-4 text-muted small border-top border-bottom"&gt;
    &lt;h3 class="text-muted small"&gt;Footnotes&lt;/h3&gt;
    &lt;ul&gt;
      &lt;li&gt;Featured image photo by &lt;a href="https://unsplash.com/@mattgyver?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Matt Benson&lt;/a&gt; on &lt;a href="https://unsplash.com/photos/a-couple-of-men-standing-next-to-each-other-qlaj76CocqY?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Unsplash&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
&lt;/div&gt;</description><category>Agency</category><category>AI</category><category>AOE</category><category>Business</category><category>Multi-Agent</category><category>Workflow</category><guid>https://anothercoffee.net/run-dozens-of-projects-with-ai/</guid><pubDate>Sat, 20 Dec 2025 12:00:00 GMT</pubDate></item><item><title>Building an Operating Environment for AI Agents</title><link>https://anothercoffee.net/building-an-operating-environment-for-ai-agents/</link><dc:creator>Anthony Lopez-Vito</dc:creator><description>&lt;figure&gt;&lt;img src="https://anothercoffee.net/images/posts/aoe-howibuild-og-1200x630.jpg"&gt;&lt;/figure&gt; &lt;p&gt;Over the past twenty years I've managed server infrastructure, built WordPress sites, handled data migrations and put together architecture specs for clients across multiple organisations. If you juggle multiple clients you'll know the feeling. Every project has its own context, its own history, its own set of things that need to happen next. Even when you use some sort of project management tool, a lot of the details still need to live in your head. Furthermore, every time you switch between projects, you need to reframe your mindspace to recall where a particular project is in that particular moment in time.&lt;/p&gt;
&lt;p&gt;I've now unexpectedly solved this problem by building a coordination system that lets AI agents work across projects, clients, and organisations. It wasn't something I set out to build. Instead, it grew organically as I began incorporating agentic coding tools into my everyday workflow. I've started calling it an Agentic Operating Environment, or &lt;a href="https://anothercoffee.net/categories/aoe/"&gt;AOE&lt;/a&gt;. Essentially, it's a set of file-based conventions that give AI agents persistent memory and a way to coordinate across projects and vendors.&lt;/p&gt;
&lt;h3 id="the-problem-that-crept-up-on-me"&gt;The Problem That Crept Up on Me&lt;/h3&gt;
&lt;p&gt;I was &lt;a href="https://anothercoffee.net/why-we-keep-using-chatllm/"&gt;already using AI seriously&lt;/a&gt; across my projects for a range of different tasks like writing code, research and troubleshooting, drafting client documentation, and processing data. The problem was continuity. Every new conversation started from scratch and I'd spend the first ten minutes of every session re-explaining what the problem or project was about, what had been done, and what needed to happen next. I'd have to prepare and manually upload files to give context. Switch to a different AI tool and the whole thing started from zero. This is fine for a one-off task, but it was a real hassle for working on a dozen active projects across multiple clients.&lt;/p&gt;
&lt;p&gt;Then context files appeared. &lt;a href="https://docs.anthropic.com/en/docs/claude-code/overview"&gt;Claude Code&lt;/a&gt; introduced a file the agent reads at the start of every session, giving it project background and instructions. Other tools followed with their own versions, and suddenly agents could pick up where the last one left off. I remember thinking that this changes everything.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;Most people were writing code with these tools but I was now starting to run my business through them.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;Most people were writing code with these tools but I was now starting to run my business through them. For that purpose, knowing about one project at a time wasn't nearly enough. I needed agents that understood what was happening across projects: what's blocked, who's waiting on what, which client deliverable depends on which internal task finishing first.&lt;/p&gt;
&lt;p&gt;This problem opened up a bigger question. If an agent can read a context file, what else could you put in it? Could you give it genuine working memory, like what happened last session, what's blocked, what another project needs from this one? Also, what if one agent's work needed to inform a different project? Could you pass that context along without me being the human messenger as was needed in the web versions?&lt;/p&gt;
&lt;p&gt;What I ended up with is a few things that work together:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Working memory and project context&lt;/strong&gt; so agents know what happened last time and understand the project&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A memo system&lt;/strong&gt; so they can talk across projects&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Vendor-neutral conventions&lt;/strong&gt; so I'm not locked to one AI provider&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Specialised agents&lt;/strong&gt; each handling a different area of work&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Utility agents&lt;/strong&gt; that manage the environment itself: a coordinator for cross-project
visibility, a sysadmin for machine configuration, a configurator that scaffolds new
projects&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A factory&lt;/strong&gt; that deploys new agent environments from templates&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The individual pieces aren't complicated. Getting them to work together reliably across dozens of projects is where the time went, and what emerged is an architecture that looks fundamentally different from how most people structure their AI assistants.&lt;/p&gt;
&lt;h3 id="it-doesnt-look-like-much"&gt;It Doesn't Look Like Much&lt;/h3&gt;
&lt;figure class="figure d-flex flex-column align-items-center"&gt;
&lt;img src="https://anothercoffee.net/images/posts/aoe-new-deployment.png" alt="New AOE deployment for a website manager agent" title="Screenshot of New AOE deployment" class="img-fluid" style="max-width: 100%; height: auto;"&gt;
&lt;figcaption class="figure-caption text-center mt-2"&gt;New AOE deployment for our website manager&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;It doesn't look revolutionary. It's a terminal interface that's been used since the earliest days of computing. But what's happening in that conversation is that I described what a project needed in plain English and an agent built the entire working environment: context files, state tracking, communication channels, vendor-specific instructions. No commands to remember, no forms to fill in, no setup wizard, just a description of the outcome I wanted.&lt;/p&gt;
&lt;p&gt;This is what I mean by "operating environment". It's not an operating system that manages hardware and device drivers, but something that sits above all of that. Where your OS gives you a graphical or command line interface to the computer (windows, menus, mouse clicks, command parameters to remember), this gives you a natural language one. You just describe what you need in plain English and an agent works out how to get it done.&lt;/p&gt;
&lt;p&gt;What surprised me is how little custom code this requires. Tools like Claude Code, Codex, and Gemini CLI already ship with everything you need to get started. They read and write files, run shell commands, search codebases, and follow instructions from context files. I'm not stitching together third-party tools and hoping they still work next month. Instead, the foundation rests on core features that the vendors build and maintain.&lt;/p&gt;
&lt;p&gt;The hard part is more the conventions than the tooling. For example, what goes in each file, how to draw boundaries between projects, how to keep agents following the rules session after session. That's where the time went.&lt;/p&gt;
&lt;p&gt;This started off as a simple improvement for one client but has now completely changed the way I use a computer and manage my projects.&lt;/p&gt;
&lt;h3 id="working-memory"&gt;Working Memory&lt;/h3&gt;
&lt;p&gt;Every project gets a handful of markdown files that serve as the agent's working memory. A project brief (&lt;code&gt;BRIEF.md&lt;/code&gt;) captures confirmed knowledge that changes slowly, like what the project is, who's involved, and what's been decided about scope and technical direction. A state file (&lt;code&gt;STATE.md&lt;/code&gt;) tracks what's happening right now and what needs to happen next. An archive (&lt;code&gt;HISTORY.md&lt;/code&gt;) holds older progress that isn't immediately relevant. Other files capture reusable patterns (&lt;code&gt;INSIGHTS.md&lt;/code&gt;) and whatever domain context the agent needs. The whole lot sits in a git repository, so nothing falls through the cracks and every change is auditable.&lt;/p&gt;
&lt;p&gt;Here's what a typical project looks like on disk:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;project-root/
├── BRIEF.md           # Confirmed project knowledge
├── CLAUDE.md          # Agent instructions
├── STATE.md           # Working memory
├── HISTORY.md         # Archived progress
├── INSIGHTS.md        # Reusable learnings
├── memos/
│   ├── incoming/      # Unprocessed messages
│   └── archive/       # Completed messages
└── docs/              # Project-specific reference
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;If any of that sounds abstract, the files map to things you already know:&lt;/p&gt;
&lt;table class="table table-bordered mt-4 mb-4"&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;File&lt;/th&gt;&lt;th&gt;Purpose&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;BRIEF.md&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;What the project is, who's involved, key decisions&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;STATE.md&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;What's happening now, what's next&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;HISTORY.md&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Archived progress&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;INSIGHTS.md&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Reusable patterns and gotchas&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;CLAUDE.md&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;How the agent should behave&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
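&lt;p&gt;The layout above is simple enough to scaffold with a few lines of Python. The sketch below is one possible way to do it and is not the configurator agent described in this article: &lt;code&gt;scaffold_project&lt;/code&gt; is a hypothetical helper, and the placeholder headings it writes into each file are assumptions. Only the file and folder names come from the layout shown.&lt;/p&gt;

```python
from pathlib import Path

# File and folder names follow the layout above; the placeholder
# headings written into each file are assumptions for illustration.
MEMORY_FILES = {
    "BRIEF.md": "# Brief\n",
    "CLAUDE.md": "# Agent instructions\n",
    "STATE.md": "# State\n",
    "HISTORY.md": "# History\n",
    "INSIGHTS.md": "# Insights\n",
}
FOLDERS = ["memos/incoming", "memos/archive", "docs"]


def scaffold_project(root):
    """Create the memory files and folders, leaving existing files untouched."""
    root = Path(root)
    created = []
    for folder in FOLDERS:
        (root / folder).mkdir(parents=True, exist_ok=True)
    for name, placeholder in MEMORY_FILES.items():
        target = root / name
        # Never overwrite a file that an agent may already have filled in.
        if not target.exists():
            target.write_text(placeholder, encoding="utf-8")
            created.append(name)
    return created
```

&lt;p&gt;Running it twice on the same directory is safe: the second call returns an empty list and leaves existing content alone, which matters once agents have started writing to these files.&lt;/p&gt;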

&lt;p&gt;It works in layers, with the agent reading STATE.md at the start of a session, doing its work, and updating it when it's done. Here's what a real one looks like. It's anonymised, but the structure is genuine.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="gu"&gt;## Quick Reference&lt;/span&gt;

| Key | Value |
|-----|-------|
| Project Path | ~/Projects/ClientA/web-dev |
| Last Updated | 2025-09-28 |
| Last Agent | Claude Code |
| Current Focus | Header component refactor |

&lt;span class="gu"&gt;## Current Status&lt;/span&gt;

Theme migration 80% complete. Navigation and footer done,
header still needs mobile breakpoints. Blocked on logo
asset from client.

&lt;span class="gu"&gt;## Recent Progress&lt;/span&gt;

&lt;span class="gu"&gt;### 2025-09-28&lt;/span&gt;
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Completed footer template with accessibility fixes
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Agent: Claude Code

&lt;span class="gu"&gt;### 2025-09-26&lt;/span&gt;
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Migrated navigation patterns from legacy theme
&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Agent: Gemini CLI

&lt;span class="gu"&gt;## Next Actions&lt;/span&gt;

&lt;span class="k"&gt;1.&lt;/span&gt; [x] Migrate navigation patterns
&lt;span class="k"&gt;2.&lt;/span&gt; [x] Rebuild footer template
&lt;span class="k"&gt;3.&lt;/span&gt; [ ] Refactor header component (blocked: logo asset)
&lt;span class="k"&gt;4.&lt;/span&gt; [ ] Cross-browser testing

&lt;span class="gu"&gt;## Handover Notes&lt;/span&gt;

Header refactor is ready to go once the client sends the
logo. SVG preferred, PNG fallback. The mobile nav uses a
slide-out pattern, not a dropdown. See docs/nav-spec.md.
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Every new session picks up exactly where the last one left off. The agent knows what's done, what's blocked, and why. Older entries get archived periodically to keep the working file lean.&lt;/p&gt;
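&lt;p&gt;Because the state file is plain markdown with a predictable shape, ordinary scripts can read it as easily as agents can. As a rough sketch, assuming the "Next Actions" section always uses the numbered checkbox format from the example above, a hypothetical &lt;code&gt;open_actions&lt;/code&gt; helper could list what is still outstanding:&lt;/p&gt;

```python
import re

# Matches numbered checklist lines such as "3. [ ] Refactor header component".
ITEM = re.compile(r"^\d+\.\s+\[([ xX])\]\s+(.*)$")


def open_actions(state_text):
    """Return the unchecked items from the "## Next Actions" section."""
    pending = []
    in_section = False
    for line in state_text.splitlines():
        # Any second-level heading starts a new section.
        if line.startswith("## "):
            in_section = line.strip() == "## Next Actions"
            continue
        if not in_section:
            continue
        match = ITEM.match(line.strip())
        if match and match.group(1) == " ":
            pending.append(match.group(2))
    return pending


SAMPLE = """## Next Actions

1. [x] Migrate navigation patterns
2. [x] Rebuild footer template
3. [ ] Refactor header component (blocked: logo asset)
4. [ ] Cross-browser testing

## Handover Notes

Header refactor is ready to go once the client sends the logo.
"""

print(open_actions(SAMPLE))
```

&lt;p&gt;For the sample, this prints the two unchecked items: the header refactor and the cross-browser testing.&lt;/p&gt;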
&lt;h3 id="pick-the-right-tool-for-the-job"&gt;Pick the Right Tool for the Job&lt;/h3&gt;
&lt;p&gt;I use Claude Code, Codex, Gemini CLI, and DeepAgent across different projects, and each has strengths for different types of work (and some genuinely maddening blind spots, but that's a longer conversation). The system doesn't care which one you pick.&lt;/p&gt;
&lt;p&gt;Each project carries vendor-specific instruction files like CLAUDE.md for Claude Code, AGENTS.md for Codex, GEMINI.md for Gemini CLI. DeepAgent handles its configuration through its own rules system rather than a markdown file. These files tell the agent how to work in this particular project, covering things like session protocols, coding standards, and how to handle state updates. The content differs per vendor because each tool loads instructions differently, but the conventions they enforce are identical. If I want to switch providers for a project, nothing breaks.&lt;/p&gt;
&lt;p&gt;What those instruction files actually enforce is the same core loop. Read STATE.md and check for incoming memos, extract your action list, do the work, update STATE.md with progress and handover notes, commit. Every agent follows this regardless of vendor. The instruction files also cover git conventions (like always use the &lt;code&gt;-C&lt;/code&gt; flag, never commit to another project's repo) and cross-project communication rules (send memos, don't edit other projects' files). The format is vendor-specific but the behaviour is nearly identical.&lt;/p&gt;
&lt;p&gt;This is a deliberate choice. The AI vendor market is volatile, with pricing changes, shifting capabilities, and new tools appearing constantly. Last year's best option might not be this year's, and if your operation depends on one provider's way of doing things, a pricing change or service outage becomes a showstopper.&lt;/p&gt;
&lt;figure class="figure d-flex flex-column align-items-center"&gt;
&lt;img src="https://anothercoffee.net/images/posts/claude-api-error-500.png" alt="Claude API Error 500" title="Screenshot of Claude API Error 500" class="img-fluid" style="max-width: 100%; height: auto;"&gt;
&lt;figcaption class="figure-caption text-center mt-2"&gt;Claude API errors? Launch another vendor's model and keep working.&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;With vendor-neutral conventions, I simply run another vendor's agent and the project keeps running. It's a standard part of my workflow. Also, because the conventions are just files, there's nothing stopping you from running local models if your hardware is powerful enough. The system doesn't care whether the AI is in the cloud or on your own machine.&lt;/p&gt;
&lt;h3 id="cross-project-messaging"&gt;Cross-project Messaging&lt;/h3&gt;
&lt;p&gt;The agents needed to communicate across projects just as human team members do. For example, an infrastructure change in one might affect web development in another, or a content decision might depend on input from project management. However, the agents lived in separate project directories with no native way to talk to each other.&lt;/p&gt;
&lt;p&gt;Initially I let agents edit files directly in other projects. That proved unworkable very early on, when an agent tidied up what it thought were stale files that another agent needed. The fix was obvious because my university research was in multiprocessor computing, where this is a solved problem. Linux processes don't scribble in each other's memory; they communicate through pipes and message queues. Early bulletin board systems worked the same way: post a message to a board, and the recipient reads it when they next connect. The principle hasn't changed in decades: send a message, let the receiver deal with it in its own time.&lt;/p&gt;
&lt;p&gt;Same principle here. Instead of agents editing each other's files, they send structured messages (I call them &lt;em&gt;memos&lt;/em&gt;) into each other's &lt;code&gt;memos/incoming/&lt;/code&gt; directories. The receiving project picks them up at its next session, then archives any whose required actions are complete. Because the memos are plain text files with naming conventions, they work across any AI tool that can read files. A Claude agent sends a memo, a Gemini agent picks it up, and the vendor boundary becomes invisible.&lt;/p&gt;
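&lt;p&gt;Sending a memo amounts to writing one file into another project's inbox and touching nothing else. The sketch below shows that idea in Python. The helper name &lt;code&gt;send_memo&lt;/code&gt; and its parameters are hypothetical, and the agents themselves write memos by following their instruction files rather than by calling a script like this.&lt;/p&gt;

```python
from pathlib import Path
import tempfile


def send_memo(recipient_dir, topic, title, sender, recipient, date, purpose, actions):
    """Write a memo into the recipient project's memos/incoming/ directory."""
    incoming = Path(recipient_dir) / "memos" / "incoming"
    incoming.mkdir(parents=True, exist_ok=True)
    lines = [
        f"# Memo: {title}",
        "",
        f"**From:** {sender}",
        f"**To:** {recipient}",
        f"**Date:** {date}",
        "",
        "---",
        "",
        "## Purpose",
        "",
        purpose,
        "",
        "## Action Required",
        "",
    ]
    # Every requested action starts life as an unticked checkbox.
    lines += [f"- [ ] {action}" for action in actions]
    lines += ["", "---", "", "## Completion Notes", ""]
    memo_path = incoming / f"MEMO-{topic}.md"
    memo_path.write_text("\n".join(lines), encoding="utf-8")
    return memo_path


# Demonstration against a throwaway recipient directory.
with tempfile.TemporaryDirectory() as tmp:
    memo_path = send_memo(
        recipient_dir=Path(tmp) / "sysadmin",
        topic="staging-ready",
        title="Staging Deployment Ready",
        sender="web-dev",
        recipient="sysadmin",
        date="2026-02-28",
        purpose="Header refactor is complete. Ready for staging deployment.",
        actions=["Deploy to staging environment", "Run smoke tests and report back"],
    )
    memo_name = memo_path.name
    memo_text = memo_path.read_text(encoding="utf-8")

print(memo_name)
```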
&lt;figure class="figure d-flex flex-column align-items-center"&gt;
&lt;img src="https://anothercoffee.net/images/posts/aoe-code-error-01.png" alt="Agents sending memos" title="Screenshot of Agents sending memos" class="img-fluid" style="max-width: 100%; height: auto;"&gt;
&lt;figcaption class="figure-caption text-center mt-2"&gt;A sysadmin agent sends a bug to the project's developer agent.&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;The files are named &lt;code&gt;MEMO-&amp;lt;topic&amp;gt;.md&lt;/code&gt; and follow a standard format. Here's one (anonymised) where a web development agent is notifying an infrastructure agent that a deployment is ready:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="gh"&gt;# Memo: Staging Deployment Ready&lt;/span&gt;

&lt;span class="gs"&gt;**From:**&lt;/span&gt; web-dev @ ~/Projects/ClientA/web-dev
&lt;span class="gs"&gt;**To:**&lt;/span&gt; sysadmin @ ~/Projects/ClientA/sysadmin
&lt;span class="gs"&gt;**Date:**&lt;/span&gt; 2026-02-28
&lt;span class="gs"&gt;**Subject:**&lt;/span&gt; Theme migration ready for staging

---

&lt;span class="gu"&gt;## Purpose&lt;/span&gt;

Header refactor is complete. Ready for staging deployment.

&lt;span class="gu"&gt;## Content&lt;/span&gt;

All templates migrated and tested locally. No new
dependencies. The only change to server config is the
new image optimisation path in the nginx rules
(documented in docs/nginx-changes.md).

&lt;span class="gu"&gt;## Action Required&lt;/span&gt;

&lt;span class="k"&gt;- [ ]&lt;/span&gt; Deploy to staging environment
&lt;span class="k"&gt;- [ ]&lt;/span&gt; Verify nginx config change
&lt;span class="k"&gt;- [ ]&lt;/span&gt; Run smoke tests and report back

---

&lt;span class="gu"&gt;## Completion Notes&lt;/span&gt;
&amp;lt;!-- Receiving agent: complete this section before archiving --&amp;gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The receiving agent reads the memo at its next session, works through the checkboxes, fills in the completion notes, and archives it. The checkboxes are the accountability mechanism and a memo can't be archived until every action is ticked off. At least, that's how it's supposed to work; in practice, agents occasionally skip steps or archive memos prematurely. Getting them to follow the conventions reliably is one of the harder problems I've had to solve.&lt;/p&gt;
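&lt;p&gt;One way to back the convention with something firmer than instructions is a small guard that refuses to archive a memo while any checkbox is still open. The following Python sketch shows the idea. It is an assumption about how such a check could be written, not a description of my setup, and the &lt;code&gt;memos/archive/&lt;/code&gt; location and function names are hypothetical.&lt;/p&gt;

```python
from pathlib import Path
import tempfile


def open_actions(memo_text):
    """Return the action lines that are still unticked."""
    return [
        line.strip()
        for line in memo_text.splitlines()
        if line.strip().startswith("- [ ]")
    ]


def archive_memo(memo_path, archive_dir):
    """Move a memo to the archive only when every action is ticked."""
    memo = Path(memo_path)
    remaining = open_actions(memo.read_text(encoding="utf-8"))
    if remaining:
        raise ValueError(f"{memo.name} still has {len(remaining)} open action(s)")
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    return memo.rename(archive / memo.name)


# Demonstration: the first attempt is refused, the second succeeds.
with tempfile.TemporaryDirectory() as tmp:
    memo = Path(tmp) / "memos" / "incoming" / "MEMO-staging-ready.md"
    memo.parent.mkdir(parents=True)
    memo.write_text(
        "- [x] Deploy to staging environment\n- [ ] Run smoke tests and report back\n",
        encoding="utf-8",
    )
    archive_dir = Path(tmp) / "memos" / "archive"
    try:
        archive_memo(memo, archive_dir)
        blocked = False
    except ValueError:
        blocked = True
    # Tick the remaining action, then try again.
    ticked = memo.read_text(encoding="utf-8").replace("- [ ]", "- [x]")
    memo.write_text(ticked, encoding="utf-8")
    archived = archive_memo(memo, archive_dir)
    archived_exists = archived.exists()

print(blocked, archived.name)
```

&lt;p&gt;A check like this only covers the mechanical half of the problem. It can tell that a box is ticked, not that the work behind it was actually done.&lt;/p&gt;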
&lt;h3 id="putting-agents-to-work"&gt;Putting Agents to Work&lt;/h3&gt;
&lt;p&gt;I started with one client who had particularly complex needs. Instead of one general-purpose AI assistant, I created specialised agents, each handling a different area such as systems administration, web development, content curation, project management, research, and document generation. Each agent has its own working memory and domain expertise.&lt;/p&gt;
&lt;p&gt;Then I deployed the same approach on a second client, and the cognitive burden genuinely lifted. The agents remembered every detail I used to carry around in my head, across two separate clients, without mixing anything up. I stopped being the person who has to remember everything.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;I stopped being the person who has to remember everything.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;Sitting above the project agents is a coordinator: a dedicated agent whose job is cross-project visibility. It reads the state of every project (44 at last count, across 14 organisations), tracks what's blocked, routes memos between projects that don't know each other's paths, and maintains a dashboard of the whole operation.&lt;/p&gt;
&lt;p&gt;The way it knows what's happening is straightforward because each project is registered in a YAML file:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;client-a-web&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;~/Projects/ClientA/web-dev&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;organization&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;client-a&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;WordPress theme development and maintenance&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;web&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;wordpress&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;dependencies&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;client-a-infra&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;registered&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;2026-01-15&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;A scanner reads STATE.md from every registered project and builds a view of what's active, what's blocked, and what's gone quiet. The mechanism is simple but the visibility it gives you is automatic. No one has to update a dashboard or fill in a status report as the agents' own working files are the source of truth.&lt;/p&gt;
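&lt;p&gt;As a rough sketch of how such a scanner could work, the Python below walks a list of registry entries and classifies each project from its &lt;code&gt;STATE.md&lt;/code&gt;. Several things here are assumptions for the sake of a runnable example: the registry is passed in as already-parsed dictionaries rather than read from YAML, a project counts as blocked if its state file mentions the word "blocked", and it counts as quiet if the file hasn't been modified for 14 days.&lt;/p&gt;

```python
from pathlib import Path
import os
import tempfile
import time


def scan_projects(registry, quiet_after_days=14, now=None):
    """Classify each registered project from its STATE.md."""
    now = time.time() if now is None else now
    report = {}
    for entry in registry:
        state_file = Path(entry["path"]).expanduser() / "STATE.md"
        if not state_file.exists():
            report[entry["name"]] = "missing"
            continue
        text = state_file.read_text(encoding="utf-8")
        age_days = (now - state_file.stat().st_mtime) / 86400
        # Assumption: a naive keyword match stands in for real parsing.
        if "blocked" in text.lower():
            report[entry["name"]] = "blocked"
        elif age_days > quiet_after_days:
            report[entry["name"]] = "quiet"
        else:
            report[entry["name"]] = "active"
    return report


# Demonstration with three throwaway projects.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "client-a-web": "1. [ ] Cross-browser testing\n",
        "client-a-infra": "Status: blocked on DNS change\n",
        "client-a-docs": "Nothing in progress\n",
    }
    registry = []
    for name, text in samples.items():
        project = Path(tmp) / name
        project.mkdir()
        (project / "STATE.md").write_text(text, encoding="utf-8")
        registry.append({"name": name, "path": str(project)})
    # Backdate one STATE.md by 30 days so it counts as quiet.
    old = time.time() - 30 * 86400
    os.utime(Path(tmp) / "client-a-docs" / "STATE.md", (old, old))
    report = scan_projects(registry)

print(report)
```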
&lt;h3 id="an-agent-that-builds-agents"&gt;An Agent That Builds Agents&lt;/h3&gt;
&lt;p&gt;When the conventions started working reliably, I wanted a way to replicate them without manually setting up every new project, so I built what amounts to a factory agent. I give the agent a project brief and it scaffolds everything in one step. The output is a complete project directory including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;instruction files for each AI vendor, with session protocols, git conventions, and cross-project communication rules baked in;&lt;/li&gt;
&lt;li&gt;a &lt;code&gt;STATE.md&lt;/code&gt; with the Quick Reference table and empty sections ready to fill;&lt;/li&gt;
&lt;li&gt;a memos directory;&lt;/li&gt;
&lt;li&gt;its own &lt;code&gt;README&lt;/code&gt;, &lt;code&gt;HISTORY.md&lt;/code&gt;, &lt;code&gt;INSIGHTS.md&lt;/code&gt;, and whatever domain-specific extras the project type calls for.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Different project types get different scaffolds. A sysadmin project gets server inventory sections and SSH configuration templates. A bookkeeping project gets invoices and reconciliation directories. The conventions are the same across all of them but the domain-specific bits vary. I can point the factory at an existing working project and say "set up a new one like this, but for Client B." It uses the existing project as a reference and adjusts for the new context.&lt;/p&gt;
&lt;p&gt;The factory has deployed over forty projects so far, each one inheriting the same coordination conventions. A new project goes from nothing to a working agent environment in one step.&lt;/p&gt;
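&lt;p&gt;The factory itself is an agent working from a brief, but the skeleton it produces can be approximated in a few lines of Python. The sketch below is only that, an approximation. The placeholder file contents, the &lt;code&gt;README.md&lt;/code&gt; filename, and the per-type directory names in &lt;code&gt;EXTRA_DIRS&lt;/code&gt; are hypothetical stand-ins for whatever a real project type calls for.&lt;/p&gt;

```python
from pathlib import Path
import tempfile

# Files every project gets, regardless of type.
COMMON_FILES = [
    "CLAUDE.md", "AGENTS.md", "GEMINI.md",
    "STATE.md", "README.md", "HISTORY.md", "INSIGHTS.md",
]

# Hypothetical domain-specific extras per project type.
EXTRA_DIRS = {
    "sysadmin": ["inventory", "ssh-templates"],
    "bookkeeping": ["invoices", "reconciliation"],
}


def scaffold_project(root, name, project_type):
    """Create a new project directory with the shared conventions in place."""
    project = Path(root) / name
    # mkdir without exist_ok raises FileExistsError, so an existing
    # project is never overwritten.
    (project / "memos" / "incoming").mkdir(parents=True)
    for filename in COMMON_FILES:
        (project / filename).write_text(f"# {name}: {filename}\n", encoding="utf-8")
    for dirname in EXTRA_DIRS.get(project_type, []):
        (project / dirname).mkdir()
    return project


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    project = scaffold_project(tmp, "client-b-books", "bookkeeping")
    created = sorted(p.name for p in project.iterdir())
    has_inbox = (project / "memos" / "incoming").is_dir()

print(created)
```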
&lt;h3 id="star-vs-mesh"&gt;Star vs Mesh&lt;/h3&gt;
&lt;p&gt;Now that you've seen all the pieces, it's worth stepping back to look at the structure of what emerged because it's not the shape most people build.&lt;/p&gt;
&lt;p&gt;Most AI assistant frameworks use a hub-and-spoke topology. &lt;a href="https://danielmiessler.com/blog/personal-ai-infrastructure" target="_blank" rel="noopener"&gt;Daniel Miessler's PAI&lt;/a&gt; is probably the best-known example with one central assistant, one identity, one memory system, and skill modules for different domains. &lt;a href="https://www.youtube.com/watch?v=aYAVSG4Ra40" target="_blank" rel="noopener"&gt;Kenny Liao's&lt;/a&gt; personal assistant follows a similar pattern, with domain-specific plugins loaded into a single runtime. The skills are more sophisticated than they look. PAI's chain into each other, and Liao's use progressive context loading. But the structural principle is the same: one hub and everything flows through it.&lt;/p&gt;
&lt;p&gt;&lt;svg viewbox="0 0 460 270" xmlns="http://www.w3.org/2000/svg" role="img" aria-label="Star topology: a single assistant hub with skill modules branching off it" style="max-width: 460px; margin: 1.5em auto; display: block; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;"&gt;
  &lt;defs&gt;
    &lt;marker id="star-arrow" viewbox="0 0 10 8" refx="10" refy="4" markerwidth="8" markerheight="6" orient="auto"&gt;
      &lt;path d="M0,0 L10,4 L0,8 Z" fill="#aaa"&gt;&lt;/path&gt;
    &lt;/marker&gt;
  &lt;/defs&gt;
  &lt;rect x="185" y="12" width="90" height="36" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="230" y="34" text-anchor="middle" font-size="14" fill="#333"&gt;You&lt;/text&gt;
  &lt;line x1="230" y1="48" x2="230" y2="82" stroke="#aaa" stroke-width="1.5" marker-end="url(#star-arrow)"&gt;&lt;/line&gt;
  &lt;rect x="140" y="88" width="180" height="56" rx="5" fill="#fff5f5" stroke="#d4576b" stroke-width="2"&gt;&lt;/rect&gt;
  &lt;text x="230" y="111" text-anchor="middle" font-size="14" fill="#333"&gt;Single Assistant&lt;/text&gt;
  &lt;text x="230" y="130" text-anchor="middle" font-size="12" fill="#888"&gt;(PAI)&lt;/text&gt;
  &lt;line x1="185" y1="144" x2="105" y2="198" stroke="#aaa" stroke-width="1.5" marker-end="url(#star-arrow)"&gt;&lt;/line&gt;
  &lt;line x1="230" y1="144" x2="230" y2="198" stroke="#aaa" stroke-width="1.5" marker-end="url(#star-arrow)"&gt;&lt;/line&gt;
  &lt;line x1="275" y1="144" x2="355" y2="198" stroke="#aaa" stroke-width="1.5" marker-end="url(#star-arrow)"&gt;&lt;/line&gt;
  &lt;rect x="25" y="204" width="120" height="46" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="85" y="222" text-anchor="middle" font-size="11" fill="#888"&gt;Skill:&lt;/text&gt;
  &lt;text x="85" y="238" text-anchor="middle" font-size="14" fill="#333"&gt;Health&lt;/text&gt;
  &lt;rect x="170" y="204" width="120" height="46" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="230" y="222" text-anchor="middle" font-size="11" fill="#888"&gt;Skill:&lt;/text&gt;
  &lt;text x="230" y="238" text-anchor="middle" font-size="14" fill="#333"&gt;Finance&lt;/text&gt;
  &lt;rect x="315" y="204" width="120" height="46" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="375" y="222" text-anchor="middle" font-size="11" fill="#888"&gt;Skill:&lt;/text&gt;
  &lt;text x="375" y="238" text-anchor="middle" font-size="14" fill="#333"&gt;Writing&lt;/text&gt;
&lt;/svg&gt;&lt;/p&gt;
&lt;p&gt;My system uses a mesh topology. I first came across mesh architectures in the early 2000s while evaluating mesh networking technologies for a Japanese trading house. The concept isn't new, so applying it to AI agent coordination seemed like a natural fit.&lt;/p&gt;
&lt;p&gt;Each project is an autonomous node with its own state (&lt;code&gt;STATE.md&lt;/code&gt;, &lt;code&gt;HISTORY.md&lt;/code&gt;), its own instructions (&lt;code&gt;CLAUDE.md&lt;/code&gt;, &lt;code&gt;AGENTS.md&lt;/code&gt;, &lt;code&gt;GEMINI.md&lt;/code&gt;), and its own agent identity. No single node holds everything, and nothing requires them to sit on the same machine. Projects communicate asynchronously via memos, like Unix processes communicating through pipes.&lt;/p&gt;
&lt;p&gt;&lt;svg viewbox="0 0 480 310" xmlns="http://www.w3.org/2000/svg" role="img" aria-label="Mesh topology: autonomous project nodes communicating via memos" style="max-width: 480px; margin: 1.5em auto; display: block; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;"&gt;
  &lt;defs&gt;
    &lt;marker id="mesh-end" viewbox="0 0 10 8" refx="10" refy="4" markerwidth="8" markerheight="6" orient="auto"&gt;
      &lt;path d="M0,0 L10,4 L0,8 Z" fill="#aaa"&gt;&lt;/path&gt;
    &lt;/marker&gt;
    &lt;marker id="mesh-start" viewbox="0 0 10 8" refx="0" refy="4" markerwidth="8" markerheight="6" orient="auto-start-reverse"&gt;
      &lt;path d="M0,0 L10,4 L0,8 Z" fill="#aaa"&gt;&lt;/path&gt;
    &lt;/marker&gt;
  &lt;/defs&gt;
  &lt;rect x="20" y="15" width="150" height="44" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="95" y="42" text-anchor="middle" font-size="13" fill="#333"&gt;client-a-web&lt;/text&gt;
  &lt;rect x="310" y="15" width="150" height="44" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="385" y="42" text-anchor="middle" font-size="13" fill="#333"&gt;client-b-web&lt;/text&gt;
  &lt;rect x="20" y="135" width="170" height="44" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="105" y="162" text-anchor="middle" font-size="13" fill="#333"&gt;coordinator&lt;/text&gt;
  &lt;rect x="310" y="135" width="150" height="44" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="385" y="162" text-anchor="middle" font-size="13" fill="#333"&gt;client-b-infra&lt;/text&gt;
  &lt;rect x="20" y="255" width="170" height="44" rx="5" fill="#f5f5f5" stroke="#555" stroke-width="1.5"&gt;&lt;/rect&gt;
  &lt;text x="105" y="282" text-anchor="middle" font-size="13" fill="#333"&gt;shared-tools&lt;/text&gt;
  &lt;line x1="178" y1="37" x2="302" y2="37" stroke="#aaa" stroke-width="1.5" marker-start="url(#mesh-start)" marker-end="url(#mesh-end)"&gt;&lt;/line&gt;
  &lt;text x="240" y="30" text-anchor="middle" font-size="11" fill="#999"&gt;memo&lt;/text&gt;
  &lt;line x1="100" y1="67" x2="100" y2="127" stroke="#aaa" stroke-width="1.5" marker-start="url(#mesh-start)" marker-end="url(#mesh-end)"&gt;&lt;/line&gt;
  &lt;text x="78" y="97" text-anchor="middle" font-size="11" fill="#999"&gt;memo&lt;/text&gt;
  &lt;line x1="385" y1="67" x2="385" y2="127" stroke="#aaa" stroke-width="1.5" marker-start="url(#mesh-start)" marker-end="url(#mesh-end)"&gt;&lt;/line&gt;
  &lt;text x="405" y="97" text-anchor="middle" font-size="11" fill="#999"&gt;memo&lt;/text&gt;
  &lt;line x1="198" y1="157" x2="302" y2="157" stroke="#aaa" stroke-width="1.5" marker-start="url(#mesh-start)" marker-end="url(#mesh-end)"&gt;&lt;/line&gt;
  &lt;text x="250" y="150" text-anchor="middle" font-size="11" fill="#999"&gt;memo&lt;/text&gt;
  &lt;line x1="105" y1="187" x2="105" y2="247" stroke="#aaa" stroke-width="1.5" marker-start="url(#mesh-start)" marker-end="url(#mesh-end)"&gt;&lt;/line&gt;
  &lt;text x="83" y="217" text-anchor="middle" font-size="11" fill="#999"&gt;memo&lt;/text&gt;
&lt;/svg&gt;&lt;/p&gt;
&lt;table class="table table-bordered mt-4 mb-4"&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;&lt;/th&gt;&lt;th&gt;Star (PAI)&lt;/th&gt;&lt;th&gt;Mesh (AOE)&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Context&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;One context for everything&lt;/td&gt;&lt;td&gt;Scoped per project; agent only sees what it needs&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Boundaries&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;All domains share memory&lt;/td&gt;&lt;td&gt;Client A's data never enters Client B's context&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Failure&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Centre goes down, everything stops&lt;/td&gt;&lt;td&gt;One project breaks, others unaffected&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Scaling&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Gets heavier as you add domains&lt;/td&gt;&lt;td&gt;Adding a project is just adding a node&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Vendor&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Tied to one agent runtime&lt;/td&gt;&lt;td&gt;Each project picks its own AI vendor&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Collaboration&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Implicit (shared memory)&lt;/td&gt;&lt;td&gt;Explicit (memos with structured actions)&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;None of the major agent frameworks frame this as a deliberate choice. Anthropic calls their pattern "orchestrator-workers". LangGraph calls it "supervisor". CrewAI calls it "hierarchical process". They describe what the orchestration does, not what shape it takes. The structural decision goes unnamed, which is probably why it goes unexamined.&lt;/p&gt;
&lt;p&gt;The hub model is optimised for one person wearing many hats, like a freelancer who does health tracking, writing, finance, and coding through a single assistant that knows their whole life. My model is optimised for many projects with clear boundaries, where a web development agent for Client A must never see Client B's project data, where cross-project coordination needs to be auditable (every memo is in git), and where different projects might need different AI vendors entirely.&lt;/p&gt;
&lt;p&gt;The word "mesh" has started showing up in enterprise AI writing too, but there it means container orchestration, agent registries, and policy enforcement. That is infrastructure for managing fleets of agents across an organisation. What I'm describing is a much simpler concept: project directories that talk to each other through plain-text files.&lt;/p&gt;
&lt;p&gt;It's the same underlying insight as PAI and similar tools. The filesystem is the context system, markdown files are the medium, but there's a key difference. Those systems use the filesystem for context while mine uses it for coordination too. Memos route between projects, state files track handoffs, every change lives in git. The filesystem is both the memory and the message bus.&lt;/p&gt;
&lt;h3 id="where-it-stands"&gt;Where It Stands&lt;/h3&gt;
&lt;p&gt;That's the full picture. Working memory, memos, vendor-neutral conventions, specialised agents, a factory to set up new projects, and an architectural structure that none of the major frameworks are talking about yet. The building blocks are markdown files, naming conventions, and structured messages. No database (unless you specifically need it for a project), no platform, and no special software. However, the judgement calls about where to draw project boundaries, how to scope agent responsibilities, and how to keep conventions working reliably across a growing number of projects are what took months of trial and error across real client work.&lt;/p&gt;
&lt;p&gt;As of writing, there are 44 active projects across 14 organisations. This is real production work involving published websites, live content, architecture specifications, server infrastructure. None of this is proof of concept or theorising.&lt;/p&gt;
&lt;p&gt;A few design principles fell out of the process that I think are worth naming:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Vendor-agnostic.&lt;/strong&gt; Works with any AI that can read markdown. No proprietary formats, no lock-in.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Convention-based.&lt;/strong&gt; No runtime dependencies, no database, no platform to maintain. It's files and naming conventions and that's it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Distributed boundaries.&lt;/strong&gt; Each project is its own island. Client A's working files never enter Client B's context unless you explicitly tell an agent to share something. Cross-project communication goes through memos as standard.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Human-in-the-loop.&lt;/strong&gt; I approve every consequential action so the agents do the legwork while I make the decisions.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Right now the whole thing is passive infrastructure. I start a session, the agent picks up where the last one left off, then I close the session. The direction is toward semi-automated, where events trigger sessions and memos route themselves. But that's a story for another post.&lt;/p&gt;
&lt;p&gt;It's not finished (honestly, I doubt it ever will be as it continues to grow and improve), but it handles real client work across multiple organisations every day. There's more to say about how this compares to other approaches and where it goes next, and I'll get to that.&lt;/p&gt;
&lt;p&gt;If you're running multiple projects and spending too much time being the human messenger between them, this approach works. I'm happy to talk through how it might apply to your setup, so &lt;a href="https://anothercoffee.net/contact/"&gt;drop me a line&lt;/a&gt;.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;This article is part of &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;an ongoing series&lt;/a&gt; on how Another Cup of Coffee is adapting to AI. &lt;a href="https://anothercoffee.net/categories/ai/"&gt;Explore all articles in this series&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;
&lt;div class="mt-5"&gt;
    &lt;h3&gt;You may also like&lt;/h3&gt;

    &lt;div class="row"&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/run-dozens-of-projects-with-ai/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/run-dozens-of-projects-ai-card-300x150.jpg" class="card-img-top" alt="One person running dozens of projects with AI agents"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/run-dozens-of-projects-with-ai/" class="listtitle"&gt;I Run Dozens of Projects with AI. The Hard Part Isn't the AI.&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2025-12-20T12:00:00Z" title="20 December 2025"&gt;20 December 2025&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                        &lt;p class="card-text flex-grow-1"&gt;One person, dozens of projects, four AI vendors. I spent a year building a coordination system for AI agents. The components are simple. Getting them right was not.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/why-we-keep-using-chatllm/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/why-we-keep-using-chatllm-card-300x150.jpg" class="card-img-top" alt="ChatLLM by Abacus.AI"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/why-we-keep-using-chatllm/" class="listtitle"&gt;Why We Keep Using ChatLLM Despite Everything That's Wrong With It&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2025-01-20T12:45:15Z" title="20 January 2025"&gt;20 January 2025&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                        &lt;p class="card-text flex-grow-1"&gt;ChatLLM delivers powerful AI capabilities at a fraction of the cost, despite terrible documentation and non-existent support. Our review reveals how we harness this rough-but-effective tool to provide value.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;
                        &lt;img src="https://anothercoffee.net/images/a-west-london-micro-agencys-journey-to-ai-featured.jpg" class="card-img-top" alt="Coffee and a laptop with ChatGPT"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/" class="listtitle"&gt;Still Alive: A Micro Agency's 20 Year Journey&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2024-10-15T15:28:15Z" title="15 October 2024"&gt;15 October 2024&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                        &lt;p class="card-text flex-grow-1"&gt;This article will be the first in a series where I'll share how Artificial Intelligence has reshaped how we operate at Another Cup of Coffee.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;</description><category>AI</category><category>AOE</category><category>Framework</category><category>Multi-Agent</category><category>Workflow</category><guid>https://anothercoffee.net/building-an-operating-environment-for-ai-agents/</guid><pubDate>Thu, 15 May 2025 13:20:00 GMT</pubDate></item><item><title>Why We Keep Using ChatLLM Despite Everything That's Wrong With It</title><link>https://anothercoffee.net/why-we-keep-using-chatllm/</link><dc:creator>Anthony Lopez-Vito</dc:creator><description>&lt;figure&gt;&lt;img src="https://anothercoffee.net/images/posts/why-we-keep-using-chatllm-og-1200x630.jpg"&gt;&lt;/figure&gt; &lt;p&gt;&lt;strong&gt;This article is &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;part of a series&lt;/a&gt; on our journey and how we at Another Cup of Coffee are adapting to Artificial Intelligence. Competing in a market dominated by larger agencies means we have to be smart and a little bit scrappy, willing to experiment with new tools and alternative ways of working.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Over the past two years, AI has become a crucial part of our strategy, helping us punch above our weight and deliver more value to our clients. In this post, I take a closer look at a tool that has become central to our workflow: &lt;strong&gt;ChatLLM from Abacus.AI&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;ChatLLM has been an incredibly useful addition to our toolkit, but it's certainly not without its drawbacks. While Abacus.AI's pricing makes their offering compelling, there are many areas where it falls short, such as poor documentation, almost no support, and unreliable implementation. Nevertheless, ChatLLM has been a valuable tool, and I'll dive into our experiences with it, covering both the positives and the frustrations.&lt;/p&gt;
&lt;p&gt;This post &lt;em&gt;has not&lt;/em&gt; been sponsored by Abacus.AI, but if you're interested in ChatLLM, we have a &lt;a href="https://chatllm.abacus.ai/ZyjcktrvCW" target="_blank" rel="nofollow noopener noreferrer"&gt;referral link&lt;/a&gt; for those who want to give it a try.&lt;/p&gt;
&lt;h2 id="how-ai-has-changed-our-work"&gt;How AI Has Changed Our Work&lt;/h2&gt;
&lt;p&gt;Before I get into ChatLLM specifically, it's worth briefly mentioning how AI has helped our day-to-day work.
We've found that with the right AI tools, we can take on projects that would normally need more people, and we spend a lot less time on boring repetitive tasks, such as data cleaning and processing.&lt;/p&gt;
&lt;p&gt;The quality of what we deliver has gone up and clients get better value while we continue to keep our pricing competitive. There have been many cases where we've been able to offer insights that used to require bringing in highly-paid specialists.&lt;/p&gt;
&lt;p&gt;For example, there have been a number of projects where we've been able to save our clients money by delivering close-to-final draft documentation and reports that required specialist expertise. Instead of immediately hiring costly third-party consultants, clients simply had our AI-generated drafts reviewed by their in-house counsel, significantly reducing professional fees.&lt;/p&gt;
&lt;p&gt;We've also cut expenses by using AI assistants to rapidly code custom, single-purpose tools rather than purchasing expensive off-the-shelf software or subscriptions. Those costs can add up significantly, especially for small agencies like ours. Building such bespoke utilities to solve specific one-time project challenges would not have been feasible without AI.&lt;/p&gt;
&lt;p&gt;Being an early mover into AI integration has given us insight into the massive shifts coming to the workforce and to society in general. We have already started adapting our business and skills to thrive in this new environment so I'm now more confident about Another Cup of Coffee's future.&lt;/p&gt;
&lt;h2 id="what-were-using"&gt;What We're Using&lt;/h2&gt;
&lt;p&gt;We've tried loads of AI platforms over the past couple of years. Some stuck around but many others didn't. Aside from ChatLLM, we regularly use a range of other tools including:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;s&gt;&lt;a href="https://www.augmentcode.com" target="_blank" rel="noopener nofollow"&gt;Augment&lt;/a&gt;&lt;/s&gt; &lt;sup&gt;*&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://claude.com/product/claude-code" target="_blank" rel="noopener nofollow"&gt;Claude Code&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;s&gt;&lt;a href="https://github.com/google-gemini/gemini-cli" target="_blank" rel="noopener nofollow"&gt;Gemini CLI&lt;/a&gt;&lt;/s&gt; &lt;sup&gt;*&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://ai.google.dev/gemini-api/docs/ai-studio-quickstart" target="_blank" rel="noopener nofollow"&gt;Google AI Studio&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://groq.com" target="_blank" rel="noopener nofollow"&gt;Groq&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://julius.ai/" target="_blank" rel="noopener nofollow"&gt;Julius&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://mistral.ai" target="_blank" rel="noopener nofollow"&gt;Mistral's Le Chat&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://www.napkin.ai" target="_blank" rel="noopener nofollow"&gt;napkin.ai&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://notebooklm.google.com/" target="_blank" rel="noopener nofollow"&gt;NotebookLM&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://ollama.com/" target="_blank" rel="noopener nofollow"&gt;Ollama&lt;/a&gt; with &lt;a href="https://github.com/open-webui/open-webui" target="_blank" rel="noopener nofollow"&gt;Open WebUI&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://replit.com/refer/anothercoffee" target="_blank" rel="noopener nofollow"&gt;Replit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;s&gt;&lt;a href="https://codeium.com/windsurf" target="_blank" rel="noopener nofollow"&gt;Windsurf&lt;/a&gt;&lt;/s&gt; &lt;sup&gt;*&lt;/sup&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p class="text-muted small"&gt;October 2025 update: We've replaced a number of tools with Claude Code, which has proven superior.&lt;/p&gt;

&lt;p&gt;They all have their specific use-cases but ChatLLM is definitely the one I personally reach for most often.&lt;/p&gt;
&lt;h2 id="chatllm-from-abacusai-whats-good-and-whats-not"&gt;ChatLLM from Abacus.AI: What's Good and What's Not&lt;/h2&gt;
&lt;h3 id="what-is-it"&gt;What Is It?&lt;/h3&gt;
&lt;p&gt;As its name implies, ChatLLM is Abacus.AI's chat-based AI platform that gives you access to a bunch of different LLMs through one interface. You pay US$10 per user per month, and get access to the most popular models like ChatGPT, Claude Sonnet, Gemini and DeepSeek. Models are released so frequently that I haven't bothered to include the version numbers.&lt;/p&gt;
&lt;p&gt;Abacus.AI doesn't just give access to multiple models. You get specialised tools like CodeLLM, their AI-assisted VS Code editor, and AI Engineer to help you build custom chatbots and AI agents.&lt;/p&gt;
&lt;h3 id="why-we-like-it"&gt;Why We Like It&lt;/h3&gt;
&lt;p&gt;It's cheap. That's it. $10 a month for all those models is excellent value as we'd be paying $50 to $100 or more per month if we subscribed to our most-used models individually. This means we can use the strengths of the different models rather than trying to make one model do everything.&lt;/p&gt;
&lt;p&gt;We use it for all sorts of things: writing first drafts of client proposals; creating documentation or reports; helping debug and improve code; project planning; summarising project progress; and general brainstorming. It's not all great and the low price comes at other costs.&lt;/p&gt;
&lt;h3 id="the-annoying-bits"&gt;The Annoying Bits&lt;/h3&gt;
&lt;h4 id="poor-documentation"&gt;Poor Documentation&lt;/h4&gt;
&lt;p&gt;The documentation is absolutely terrible, outdated and severely incomplete. Help pages casually reference features and interface elements without any explanation, expecting you to be familiar with their terminology for the various screens and settings. I've wasted so much time trying to figure out basic stuff that should be clearly explained.&lt;/p&gt;
&lt;h4 id="no-customer-support"&gt;No Customer Support&lt;/h4&gt;
&lt;p&gt;Customer support is also non-existent. I've sent questions about specific features and they go into a black hole. Good luck to you if you encounter problems. I suspect they're putting all their effort into supporting enterprise clients. If you're a small or medium-sized business, this will no doubt make you wonder if this is a solution you can trust with production-grade tasks. Right now, we don't use it for any client-facing solutions and a big reason is the lack of customer support.&lt;/p&gt;
&lt;h4 id="terrible-interface"&gt;Terrible Interface&lt;/h4&gt;
&lt;p&gt;The interface is horrible too. It feels like it was designed by engineers who just bolted on features and stuck them in weird places. Aspects such as their version of the ChatGPT Playground are a complete kludge. You can't expect the kind of seamless interaction with your work that's available on ChatGPT's user interface.&lt;/p&gt;
&lt;h4 id="codellm-a-waste-of-time"&gt;CodeLLM - A Waste of Time?&lt;/h4&gt;
&lt;p&gt;CodeLLM is Abacus.AI's answer to AI-assisted code editors like Copilot, Cursor and Windsurf. However, it feels like someone's half-hearted attempt at a side project, thrown together after hours and released, then forgotten. I won't even bother listing its problems and I recommend not wasting your time even trying it. Maybe CodeLLM will get better over time but I really don't see why it's available at all. Stick with the more well-known options.&lt;/p&gt;
&lt;h4 id="unreliable-custom-agents"&gt;Unreliable Custom Agents&lt;/h4&gt;
&lt;p&gt;When this review was first posted in January 2025, I mentioned my love-hate relationship with the custom agent feature. Back then, I found it useful for building agents to handle tasks we do regularly, such as generating documents, manipulating our back-office financial transactions, or providing chat access to local knowledge bases. I also mentioned problems with poor documentation and unexpectedly burning through credits using the AI Engineer feature.&lt;/p&gt;
&lt;p&gt;Now I've stopped using custom agents altogether because they've become unreliable. For some reason, previously working agents will hang, producing no output, or spit out an error. I did try contacting Abacus.AI support but there was no response or even acknowledgement that they'd received my message.&lt;/p&gt;
&lt;figure class="figure d-flex flex-column align-items-center"&gt;
&lt;img src="https://anothercoffee.net/images/posts/why-we-keep-using-chatllm-agents-error.png" alt="ChatLLM custom agents error" title="ChatLLM custom agents error" class="img-fluid"&gt;
&lt;/figure&gt;

&lt;h4 id="unclear-pricing-and-service-limits"&gt;Unclear Pricing and Service Limits&lt;/h4&gt;
&lt;p&gt;You'd think that at $10 per user per month, you'll have a good idea of what you'll be paying. But that's not the case. Each user is allocated 2,000,000 compute points per month. What exactly that means is not at all clear. Here's Abacus.AI's explanation:&lt;/p&gt;
&lt;blockquote class="blockquote red p-0"&gt;
    &lt;p&gt;&lt;em&gt;"Compute points are NOT TOKENS. They are simply a measure of usage of ChatLLM. With 2M compute points, you will be able to send 50,000+ messages on some LLMs in a month. 1,000,000 (1M) compute points can be as much as 70,000,000 (70M) tokens on some LLMs."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;OK, so 50,000+ messages sounds like a lot and most months we're well within the limit. However, a few times I managed to burn through most of my compute points in a couple of afternoons of using AI Engineer. How? No idea.&lt;/p&gt;
&lt;h3 id="new-features-mid-2025"&gt;New Features (mid-2025)&lt;/h3&gt;
&lt;p&gt;Abacus.AI has added a few more features to ChatLLM since I originally wrote this post:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;DeepAgent:&lt;/strong&gt; an autonomous agent that can help you with tasks like research, planning or building apps.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tasks:&lt;/strong&gt; a scheduling and automation utility designed to run AI-powered actions at certain times, intervals, or triggers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Apps:&lt;/strong&gt; no-code and full-code methods for launching apps.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Projects:&lt;/strong&gt; a feature to organize chats, files, and workflow instructions into folder workspaces.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;RouteLLM:&lt;/strong&gt; a routing mechanism to send your queries to the best underlying language model.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Because of my poor experience with ChatLLM Custom Agents, I really haven't bothered to try out DeepAgent, Tasks or Apps. These new features may be worth some experimentation if I were a reviewer or YouTube AI influencer, but my focus is on getting work done. When you're running a business, you rarely have time to play with new things, and I feel that Abacus.AI wasted my time with Custom Agents.&lt;/p&gt;
&lt;p&gt;Projects and RouteLLM are different though. They're built-in improvements to the chat interface and are probably my two favourite ChatLLM additions so I'll spend a bit of time talking about them.&lt;/p&gt;
&lt;h4 id="chatllm-projects"&gt;ChatLLM Projects&lt;/h4&gt;
&lt;p&gt;This feature is exactly like ChatGPT Projects and there's not much explanation needed here. You can organise chats into folders for easy reference. It's perfect for working on different areas within your business. In some ways, it has replaced ChatLLM's flaky Custom Agents implementation for my needs.&lt;/p&gt;
&lt;p&gt;I'll have different 'agents' in project folders where I copy and paste prompts from old chats to get certain types of work done. For example, I'll have projects for proposals, coding tasks and client needs, like contract review or support requests. Each project can have its own custom instructions and files for context. This is perfect if you want to quickly create a task-specific AI without having to mess around with building an actual agent.&lt;/p&gt;
&lt;h4 id="routellm"&gt;RouteLLM&lt;/h4&gt;
&lt;p&gt;When Abacus.AI first deployed RouteLLM, it was just one model among many alongside GPT, Claude, and Gemini. Now it's the default option. Your chats will go through RouteLLM unless you manually set your model.&lt;/p&gt;
&lt;p&gt;It does a fairly good job of routing your chat to the right model but seems to favour OpenAI's GPT-5. This could just be down to my own use-cases though. It will sometimes inappropriately route queries to Gemini 2.5 Flash, and I suspect this is Abacus.AI trying to reduce their API costs.&lt;/p&gt;
&lt;p&gt;But what's really great is the &lt;em&gt;Regenerate using&lt;/em&gt; option. If you're not quite happy with a response, or you want to see how a different model will respond, just click the model button underneath. This is a massive time-saver. The alternative would be to switch your session to another model provider or use &lt;em&gt;yet another&lt;/em&gt; service like Boxchat or chatplayground.ai.&lt;/p&gt;
&lt;figure class="figure d-flex flex-column align-items-center mb-4"&gt;
&lt;img src="https://anothercoffee.net/images/posts/why-we-keep-using-chatllm-regenerate.png" alt="ChatLLM custom agents error" title="ChatLLM custom agents error" class="img-fluid"&gt;
&lt;/figure&gt;

&lt;h2 id="is-chatllm-worth-the-price"&gt;Is ChatLLM Worth the Price?&lt;/h2&gt;
&lt;p&gt;Despite all those frustrations, ChatLLM is the AI tool we use most, and this is still the case many months after originally posting this review. The value it provides is just too good to ignore. We've become used to its shortcomings and found ways around its poor interface.&lt;/p&gt;
&lt;p&gt;I've been tempted to re-subscribe to ChatGPT or Claude whenever they release a new model but stopped to think: is it worth the extra expense when they'll be available on ChatLLM within days?&lt;/p&gt;
&lt;p&gt;If you're a small agency or freelancer and you can put up with some friction in exchange for powerful capabilities at an affordable price, ChatLLM is definitely worth looking at. Just don't think you'll get anywhere near a polished experience or helpful support for your $10.&lt;/p&gt;
&lt;h2 id="being-transparent"&gt;Being Transparent&lt;/h2&gt;
&lt;p&gt;I should mention that Abacus.AI is active in advertising through content creators, especially on YouTube. We haven't received anything for this review and I'm simply sharing our experience as paying customers. I genuinely find the tool useful and worth recommending, despite its many problems.&lt;/p&gt;
&lt;h2 id="give-it-a-try"&gt;Give It a Try&lt;/h2&gt;
&lt;p&gt;If you're curious about what ChatLLM might do for your business, I think it's worth trying for the price of a couple of cups of coffee. Use &lt;a href="https://chatllm.abacus.ai/ZyjcktrvCW" target="_blank" rel="nofollow noopener noreferrer"&gt;our referral link here&lt;/a&gt;. &lt;span class="text-muted"&gt;(We'll get $5 if you sign up but they don't say what you'll get. Probably nothing.)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;I certainly won't call myself an Abacus.AI fan, but I do think other small agencies and freelancers might benefit greatly from this tool, warts and all. For us, the trade-offs have been worth the price and like any tool, what matters is whether it fits your specific needs and how you work. ChatLLM definitely isn't perfect, but it's been really useful for us so for the price, you'll lose very little to find out if it works for you too.&lt;/p&gt;
&lt;div class="button-container mt-4 mb-2"&gt;
  &lt;a href="https://chatllm.abacus.ai/ZyjcktrvCW" target="_blank" rel="nofollow noopener noreferrer" class="btn btn-primary text-white mr-3"&gt;
    Try ChatLLM Now
  &lt;/a&gt;
&lt;/div&gt;

&lt;p class="text-center text-muted small mb-4"&gt;
    Use our referral link to try ChatLLM.
&lt;/p&gt;

&lt;hr&gt;

&lt;section class="mt-4 pt-4"&gt;
    &lt;h3&gt;You may also like&lt;/h3&gt;

    &lt;div class="row"&gt;

        &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://migratecontent.com/drupal-to-wordpress-migration-guide/"&gt;
                        &lt;img src="https://anothercoffee.net/images/drupal-to-wordpress-migration-utilities-featured.jpg" class="card-img-top" alt="Drupal to WordPress Migration Guide"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://migratecontent.com/drupal-to-wordpress-migration-guide/" class="listtitle"&gt;Drupal to WordPress Migration Guide&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2025-01-03T15:30:30Z" title="Updated for 2025"&gt;Updated for 2025&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;
                    &lt;p class="card-text flex-grow-1"&gt;In this guide, you'll find insights drawn from almost 15 years of specialising in complex Drupal to WordPress migration projects. I'll walk you through the entire migration process, from the initial evaluation to post-launch considerations.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;

        &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
        &lt;div class="card h-100"&gt;
              &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;
                      &lt;img src="https://anothercoffee.net/images/a-west-london-micro-agencys-journey-to-ai-featured.jpg" class="card-img-top" alt="Still Alive: A Micro Agency's 20 Year Journey"&gt;&lt;/a&gt;
              &lt;div class="card-body d-flex flex-column"&gt;
                  &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/" class="listtitle"&gt;Still Alive: A Micro Agency's 20 Year Journey&lt;/a&gt;&lt;/h4&gt;
                  &lt;div class="mb-2"&gt;
                      &lt;span&gt;&lt;time class="listdate" datetime="2024-10-15T15:28:15Z" title="15 October 2024"&gt;15 October 2024&lt;/time&gt;&lt;/span&gt;
                  &lt;/div&gt;

                      &lt;p class="card-text flex-grow-1"&gt;This article will be the first in a series where I'll share how Artificial Intelligence has reshaped how we operate at Another Cup of Coffee.&lt;/p&gt;
              &lt;/div&gt;
        &lt;/div&gt;
      &lt;/div&gt;

            &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
          &lt;div class="card h-100"&gt;
              &lt;a href="https://anothercoffee.net/secure-your-ai-workflow-using-local-tokenisation/"&gt;
                      &lt;img src="https://anothercoffee.net/images/Secure-your-AI-workflow-using-local-tokenisation-in-PaigeSafe-featured.jpg" class="card-img-top" alt="Secure Your AI Workflow Using Local Tokenisation"&gt;&lt;/a&gt;
              &lt;div class="card-body d-flex flex-column"&gt;
                  &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/secure-your-ai-workflow-using-local-tokenisation/" class="listtitle"&gt;Secure Your AI Workflow Using Local Tokenisation&lt;/a&gt;&lt;/h4&gt;
                  &lt;div class="mb-2"&gt;
                      &lt;span&gt;&lt;time class="listdate" datetime="2024-11-12T13:59:03Z" title="12 November 2024"&gt;12 November 2024&lt;/time&gt;&lt;/span&gt;
                  &lt;/div&gt;

                      &lt;p class="card-text flex-grow-1"&gt;Don't leak confidential client data when using cloud-based LLMs. Secure your AI workflow with local tokenisation using PaigeSafe.&lt;/p&gt;
              &lt;/div&gt;
          &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

&lt;/section&gt;</description><category>Agency</category><category>AI</category><category>LLM</category><category>Operations</category><category>Productivity</category><category>Tools</category><category>Utilities</category><category>Workflow</category><guid>https://anothercoffee.net/why-we-keep-using-chatllm/</guid><pubDate>Mon, 20 Jan 2025 12:45:15 GMT</pubDate></item><item><title>Still Alive: A Micro Agency's 20 Year Journey - Part 2</title><link>https://anothercoffee.net/still-alive-part2-balance/</link><dc:creator>Anthony Lopez-Vito</dc:creator><description>&lt;figure&gt;&lt;img src="https://anothercoffee.net/images/a-west-london-micro-agencys-journey-p2-og-1200x630.jpg"&gt;&lt;/figure&gt; &lt;p&gt;&lt;em&gt;In Part 1 of &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;Still Alive: A Micro Agency's 20 Year Journey&lt;/a&gt;, I reflected on Another Cup of Coffee's journey from a one-person freelancer operation running from a rented mailbox address, to setting up in a trendy unit at Westbourne Studios. Here I recount the initial challenges we faced as an agency and how we survived by transforming the way we worked.&lt;/em&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="finding-a-balance-growth-and-resources"&gt;Finding a Balance: Growth and Resources&lt;/h2&gt;
&lt;p&gt;A few doors down sat what I wanted for our future. It was a large and successful media agency staffed with very confident ex-BBC people. They had loads of sales reps and account managers. I think it was among the first generation of the global web agencies we have today.&lt;/p&gt;
&lt;p&gt;The founder was famous in our circles for saying that he didn't bother getting out of bed for less than £30,000. That amount is common for web projects now but saying this in the early 2000s was quite grandiose and flamboyant. Though we were charging a fraction of their rates, I imagined someday reaching their scale.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;I envisaged scaling to a large agency but we were caught in a classic trap. None of us were sales people and we struggled to balance resources.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;However, running a small web agency in London is challenging to say the least and scaling was the biggest problem. We were too small for large projects, but small projects wouldn't pay for growth. It was a classic trap.&lt;/p&gt;
&lt;h3 id="sell-and-keep-selling"&gt;Sell and Keep Selling?&lt;/h3&gt;
&lt;p&gt;Looking back, I think the key to growing an agency of any size is to just sell. Sell and keep selling even if you're not sure you can deliver. This is why the large agency had so many sales reps and account managers. Other agencies I've observed over the years seem to have grown in this way: they are sales-heavy and make lots of nice promises. But I've never been comfortable with making promises unless I'm sure of them so our growth remained stunted.&lt;/p&gt;
&lt;p&gt;Meanwhile, our office space wasn't really free. It came with a heavy cost in the form of commitments that took my time away from building the business. The elephant in the room wasn't that big agency but our premature attempt at playing big.&lt;/p&gt;
&lt;p&gt;It couldn't be ignored for very long as I struggled to bring in the right projects to ensure everyone was paid. It became clear that the traditional office setup wasn't sustainable and I had to make some difficult choices, quickly.&lt;/p&gt;
&lt;h3 id="transforming-into-a-cloud-first-virtual-agency"&gt;Transforming Into a Cloud-first Virtual Agency&lt;/h3&gt;
&lt;figure class="figure d-flex flex-column align-items-center"&gt;
&lt;img src="https://anothercoffee.net/images/posts/westbourne-studios-goodbye.jpg" alt="Empty unit at Westbourne Studios" title="Goodbye Westbourne Studios" class="img-fluid" style="max-width: 100%; height: auto;"&gt;
&lt;figcaption class="figure-caption text-center mt-2"&gt;Goodbye Westbourne Studios&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Survival often means letting go of how you imagine things should be. Our biggest advantage was that we were scrappy and nimble so I decided to use it. We didn't actually need a trendy office and everyone was more productive without the daily commute into West London.&lt;/p&gt;
&lt;p&gt;Within a couple of weeks of making the decision, we transformed into a cloud-first operation. I also hired more freelancer friends from the Philippines to fill our skills gap. Suddenly, Another Cup of Coffee was a fully remote virtual agency, years before the concept became popular.&lt;/p&gt;
&lt;h3 id="digital-nomad-and-prevailing"&gt;Digital Nomad and Prevailing&lt;/h3&gt;
&lt;p&gt;Being cloud-first and remote meant that I could do the digital nomad thing for a while too, working out of all sorts of unconventional places. This opened up the opportunity to pick up clients globally. Even now, the majority of our projects come from outside the UK.&lt;/p&gt;
&lt;!--
&lt;figure class="figure d-flex flex-column align-items-center"&gt;
&lt;img src="/images/posts/digital-nomad-philippines.jpg"
    alt="Photo a palm tree offering the only spot for internet access"
    title="Life as a digital nomad - the constant search for internet access"
        class="img-fluid" 
        style="max-width: 100%; height: auto;"&gt;
&lt;figcaption class="figure-caption text-center mt-2"&gt;Life as a digital nomad - the constant search for internet access&lt;/figcaption&gt;
&lt;/figure&gt;
--&gt;
&lt;p&gt;Karl and Pafsanias moved on to other—almost certainly greener—pastures but by then we'd evolved into a collective of remote colleagues, forming teams as projects demanded. This lean and agile model allowed us to weather storms that have since sunk many of our contemporaries.&lt;/p&gt;
&lt;p&gt;I can say with some sense of accomplishment that Another Cup of Coffee has outlasted all three of the ventures that shared our unit, and also that large agency I'd once admired. The key was our ability to try out different ways of working. That agency's founders quite possibly sold the business and exited with large bonuses, but I'm after something different: an establishment that can be handed to another generation. (More about this will follow in another post.)&lt;/p&gt;
&lt;h2 id="moving-towards-specialisation"&gt;Moving Towards Specialisation&lt;/h2&gt;
&lt;p&gt;Our early advantages began to fade as remote work became mainstream and sites like Elance and oDesk gave even small businesses access to global talent. This rise in cross-border competition made it difficult to stand out by only offering general website services, and even our strong relationships with existing clients wouldn't guarantee the cash flow needed to keep the business afloat.&lt;/p&gt;
&lt;p&gt;At the same time, &lt;a href="https://migratecontent.com/drupal-the-dreaded-cms/"&gt;Drupal's growing complexity&lt;/a&gt; turned maintenance into a huge drain on resources. Again it was time to adapt and the answer came from an unexpected direction.&lt;/p&gt;
&lt;p&gt;I'd spotted a need for content migrations, specifically &lt;a href="https://anothercoffee.net/drupal-to-wordpress-migration-service/" title="Drupal to WordPress Migration Service"&gt;from Drupal to WordPress&lt;/a&gt;, and built up expertise in data-heavy web projects. It was unglamorous and mundane engine-room work so few agencies were interested in training up their talent for such a niche skill.&lt;/p&gt;
&lt;figure class="figure d-flex flex-column align-items-center"&gt;
&lt;img src="https://anothercoffee.net/images/drupal-to-wordpress-migration-tool-screenshot.jpg" alt="Screenshot of our Drupal to WordPress Migration Tool" title="Drupal to WordPress Migration Tool" class="img-fluid" style="max-width: 100%; height: auto;"&gt;
&lt;figcaption class="figure-caption text-center mt-2"&gt;Version 1 of our custom-built Drupal to WordPress Migration Tool&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;h3 id="leaning-into-your-nature"&gt;Leaning Into Your Nature&lt;/h3&gt;
&lt;p&gt;But it was necessary work that happened to fit my meticulous nature and long experience with databases. Further, businesses were beginning to realise the value of data and content while most of our colleagues were focused on dazzling user experiences. Data was our differentiator.&lt;/p&gt;
&lt;p&gt;Another Cup of Coffee shifted to becoming a boutique data migration consultancy with me as principal consultant supported by a few freelancers. Unexpectedly, I became a pioneer of &lt;a href="https://anothercoffee.net/drupal-to-wordpress-migration-service/"&gt;Drupal to WordPress migration services&lt;/a&gt; and was recommended by WP Engine in a &lt;a href="https://wpengine.com/wp-content/uploads/2017/02/WP-WP-MigratingfromDrupalToWordPress-05-PUB.pdf" target="_blank" rel="nofollow noopener noreferrer"&gt;whitepaper guide&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This ultra-narrow specialisation gave Another Cup of Coffee a much-needed distinction for almost ten years. Run a web search now and you'll see many solutions and services for content migrations but we were one of the first.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;Ultra-narrow specialisation gave us a differentiator in a crowded marketplace. Willingness to change helped us survive when confronted by something new.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;h2 id="looking-forward"&gt;Looking Forward&lt;/h2&gt;
&lt;p&gt;Since those early days, we've kept evolving alongside shifts in technology and ways of working. I can't claim that any of it was planned but I can say we've always been willing to change, just slightly ahead of our peers. Throughout this journey, I've learned that survival isn't about being the biggest or the most innovative. You survive by keeping customers happy and watching out for dust on the horizon, ready to move before a stampede arrives.&lt;/p&gt;
&lt;p&gt;We once embraced virtual offices, the cloud-first paradigm and global remote working before they were common. Now we find ourselves very suddenly confronted by something new. I think it's obvious that AI will be disrupting our lives in ways that will be hard to avoid. It's again time to learn to adapt just as we always have.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This is the first in a series of posts about our journey and &lt;a href="https://anothercoffee.net/categories/ai/" title="Posts "&gt;how we're adapting to Artificial Intelligence in our lives&lt;/a&gt;. I do hope you'll follow along with me as I share what I've learned.&lt;/em&gt;&lt;/p&gt;
&lt;iframe class="youtube-video" src="https://www.youtube.com/embed/Y6ljFaKRTrI?si=_Gya4JdG9wtHzYeT" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen&gt;&lt;/iframe&gt;

&lt;p class="text-muted"&gt;'Still Alive' from the &lt;a href="https://theportalwiki.com/wiki/Main_Page" target="_blank" rel="nofollow noopener noreferrer"&gt;Portal&lt;/a&gt; game credits. I never played the game but I've always enjoyed the sound and lyrics.&lt;/p&gt;

&lt;section class="mt-4 pt-4"&gt;
    &lt;h3&gt;You may also like&lt;/h3&gt;

    &lt;div class="row"&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
        &lt;div class="card h-100"&gt;
              &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;
                      &lt;img src="https://anothercoffee.net/images/a-west-london-micro-agencys-journey-to-ai-featured.jpg" class="card-img-top" alt="Still Alive: A Micro Agency's 20 Year Journey"&gt;&lt;/a&gt;
              &lt;div class="card-body d-flex flex-column"&gt;
                  &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/" class="listtitle"&gt;Still Alive: A Micro Agency's 20 Year Journey&lt;/a&gt;&lt;/h4&gt;
                  &lt;div class="mb-2"&gt;
                      &lt;span&gt;&lt;time class="listdate" datetime="2024-10-15T15:28:15Z" title="15 October 2024"&gt;15 October 2024&lt;/time&gt;&lt;/span&gt;
                  &lt;/div&gt;

                      &lt;p class="card-text flex-grow-1"&gt;This article will be the first in a series where I'll share how Artificial Intelligence has reshaped how we operate at Another Cup of Coffee.&lt;/p&gt;
              &lt;/div&gt;
        &lt;/div&gt;
      &lt;/div&gt;

            &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
          &lt;div class="card h-100"&gt;
              &lt;a href="https://anothercoffee.net/secure-your-ai-workflow-using-local-tokenisation/"&gt;
                      &lt;img src="https://anothercoffee.net/images/Secure-your-AI-workflow-using-local-tokenisation-in-PaigeSafe-featured.jpg" class="card-img-top" alt="Secure Your AI Workflow Using Local Tokenisation"&gt;&lt;/a&gt;
              &lt;div class="card-body d-flex flex-column"&gt;
                  &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/secure-your-ai-workflow-using-local-tokenisation/" class="listtitle"&gt;Secure Your AI Workflow Using Local Tokenisation&lt;/a&gt;&lt;/h4&gt;
                  &lt;div class="mb-2"&gt;
                      &lt;span&gt;&lt;time class="listdate" datetime="2024-11-12T13:59:03Z" title="12 November 2024"&gt;12 November 2024&lt;/time&gt;&lt;/span&gt;
                  &lt;/div&gt;

                      &lt;p class="card-text flex-grow-1"&gt;Don't leak confidential client data when using cloud-based LLMs. Secure your AI workflow with local tokenisation using PaigeSafe.&lt;/p&gt;
              &lt;/div&gt;
          &lt;/div&gt;
      &lt;/div&gt;

        &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://migratecontent.com/drupal-7-docker-containers-migration-projects/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/Drupal-Docker-Containers-card-300-150.jpg" class="card-img-top" alt="How To Set Up Drupal 7 Docker Containers for Migration Projects"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://migratecontent.com/drupal-7-docker-containers-migration-projects/" class="listtitle"&gt;How To Set Up Drupal 7 Docker Containers for Migration Projects&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2024-09-09T13:25:15Z" title="09 September 2024"&gt;09 September 2024&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;

                        &lt;p class="card-text flex-grow-1"&gt;Learn how Docker is a valuable tool for Drupal 7 end of life migrations. In this post, I'll give a step-by-step guide to setting up a Drupal 7 container for your migration project.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;
    &lt;/div&gt;

&lt;/section&gt;

&lt;div class="mt-4 pt-4 text-muted small border-top border-bottom"&gt;
    &lt;h3 class="text-muted small"&gt;Footnotes&lt;/h3&gt;
    &lt;ul&gt;
        &lt;li&gt;&lt;p&gt;Featured image photo by &lt;a href="https://unsplash.com/@nueni74?utm_content=creditCopyText&amp;amp;utm_medium=referral&amp;amp;utm_source=unsplash" target="_blank" rel="nofollow noopener noreferrer"&gt;Lil Mayer&lt;/a&gt;.
      &lt;/p&gt;&lt;/li&gt;
        &lt;li&gt;Still Alive is by Jonathan Coulton. The official video with Sara Quin and Dorit Chrysler can be &lt;a href="https://www.youtube.com/watch?v=RSsstXfcRWw" target="_blank" rel="nofollow noopener noreferrer"&gt;found here&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
&lt;/div&gt;</description><category>About Us</category><category>Agency</category><category>AI</category><category>Business</category><category>LLM</category><category>Operations</category><category>Startups</category><category>Workflow</category><guid>https://anothercoffee.net/still-alive-part2-balance/</guid><pubDate>Thu, 14 Nov 2024 15:28:15 GMT</pubDate></item><item><title>Secure Your AI Workflow Using Local Tokenisation</title><link>https://anothercoffee.net/secure-your-ai-workflow-using-local-tokenisation/</link><dc:creator>Anthony Lopez-Vito</dc:creator><description>&lt;figure&gt;&lt;img src="https://anothercoffee.net/images/Secure-your-AI-workflow-using-local-tokenisation-in-PaigeSafe-featured.jpg"&gt;&lt;/figure&gt; &lt;p&gt;&lt;strong&gt;&lt;em&gt;Secure your AI workflow with local tokenisation. PaigeSafe is a lightweight tool perfect for small agencies and freelancers handling sensitive client data in ChatGPT, Claude and other AI tools.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If you've spent any time at all using cloud-based LLMs like ChatGPT or Claude for client work, you've probably had that voice in the back of your head kick in: &lt;em&gt;"Should I really be pasting this into a chat?"&lt;/em&gt; I'm sure that moment of hesitation is all too familiar to those who have started to integrate AI into their workflows.&lt;/p&gt;
&lt;p&gt;Every day, many of us paste sensitive content into AI tools (client data, business strategies, internal documents), often without really thinking about where that information ends up. That data could become part of training sets and later crop up in chats with other users. For freelancers and small agencies handling confidential client work, Large Language Models (LLMs) create a real dilemma. They're too useful to avoid but carefully sanitising content is a real chore.&lt;/p&gt;
&lt;p&gt;Enterprises solve this with dedicated tools that are overkill and far too expensive for the rest of us. Those who want to take advantage of LLMs have been left to read carefully through documents and run manual search and replace for names and numbers. This is tedious, error-prone and still carries a high risk of data leaks. Unfortunately, taking unnecessary risks with client data, spending ages on manual anonymisation, or avoiding AI tools altogether when working with sensitive information are no longer good options if you want to remain competitive.&lt;/p&gt;
&lt;h3 id="introducing-the-paigesafe-document-security-tool"&gt;Introducing the PaigeSafe Document Security Tool&lt;/h3&gt;
&lt;p&gt;&lt;img alt="Screenshot of PaigeSafe" src="https://anothercoffee.net/images/PaigeSafe-Tokenize-Text.jpg"&gt;&lt;/p&gt;
&lt;p&gt;PaigeSafe is a document security tool that helps protect your confidential information when using Large Language Models (LLMs) like ChatGPT and Claude. It works by tokenisation: replacing sensitive data with non-sensitive placeholders. We originally built it as an in-house tool because we faced these exact same challenges. As a small team, we needed something that just worked without the expensive licences and steep learning curve.&lt;/p&gt;
&lt;p&gt;PaigeSafe is currently in the prototyping stage to test if there is demand for this type of utility. It offers basic functionality, and the code lacks robust error checking. However, since it is intended to be run locally, there is minimal risk to your documents. All it does is offer a convenient way to search and replace text before you paste or upload sensitive text to LLMs. I regularly use it to sanitise my own documents.&lt;/p&gt;
&lt;h3 id="uses-and-limitations"&gt;Uses and Limitations&lt;/h3&gt;
&lt;p&gt;PaigeSafe does not try to offer an enterprise solution for those who need to meet strict compliance regulations. Here's where it fits in the document security landscape:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Perfect for&lt;/strong&gt;: Freelancers, small agencies, independent developers&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Good for&lt;/strong&gt;: Regular business documents, client communications, project data&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Not for&lt;/strong&gt;: Banking systems, medical records, top-secret government files&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;PaigeSafe is a lightweight tool that helps you avoid accidentally exposing sensitive information to AI models. If you're handling typical client work like website data, marketing plans, business strategies, and project specs, this solution is for you. It's perfect for those "I need to run this past ChatGPT but shouldn't share the client's name" moments, or for when you want to analyse customer feedback without exposing individual identities.&lt;/p&gt;
&lt;p&gt;If you work for a financial institution, healthcare provider, or government contractor, this solution will, of course, not be for you.&lt;/p&gt;
&lt;h3 id="where-to-find-it"&gt;Where to Find it&lt;/h3&gt;
&lt;p&gt;The tool is built using Python and the Streamlit framework but if you use Docker, it can be easily installed by pulling the PaigeSafe image from Docker Hub. For more information, please visit the dedicated site at &lt;a href="https://paigesafe.com/"&gt;paigesafe.com&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Remember that it is still very much an early prototype but more useful features will follow. Please send feedback to &lt;a href="mailto:paigesafe@anothercoffee.net"&gt;paigesafe@anothercoffee.net&lt;/a&gt;.&lt;/p&gt;
&lt;div class="container my-4 p-4 border bg-light text-center"&gt;
    &lt;h4 class="grid-heading text-center mb-3"&gt;How to install PaigeSafe&lt;/h4&gt;
    &lt;p&gt;Find out how to install the prototype application by following the instructions on the PaigeSafe website.&lt;/p&gt;
    &lt;button type="button" class="btn btn-primary"&gt;&lt;a href="https://paigesafe.com"&gt;Install PaigeSafe&lt;/a&gt;&lt;/button&gt;
&lt;/div&gt;

&lt;hr&gt;

&lt;section class="mt-4 pt-4"&gt;
    &lt;h3&gt;You may also like&lt;/h3&gt;

    &lt;div class="row"&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://migratecontent.com/drupal-7-end-of-life-why-wordpress-is-the-best-migration-option/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/drupal-7-end-of-life-why-wordpress-is-the-best-migration-option-300x150.jpg" class="card-img-top" alt="Drupal 7 End of Life: Why WordPress is the Best Migration Option for Lower Maintenance Sites"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://migratecontent.com/drupal-7-end-of-life-why-wordpress-is-the-best-migration-option/" class="listtitle"&gt;Drupal 7 End of Life: Why WordPress is the Best Migration Option for Lower Maintenance Sites&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2024-12-17T14:25:15Z" title="17 December 2024"&gt;17 December 2024&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;

                        &lt;p class="card-text flex-grow-1"&gt;Drupal 7 support ends January 2025. Discover why WordPress is the cost-effective, user-friendly CMS for small agencies, freelancers, and businesses.&lt;/p&gt;

                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;

      &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/"&gt;
                        &lt;img src="https://anothercoffee.net/images/a-west-london-micro-agencys-journey-to-ai-featured.jpg" class="card-img-top" alt="Still Alive: A Micro Agency's 20 Year Journey"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/" class="listtitle"&gt;Still Alive: A Micro Agency's 20 Year Journey&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2024-10-15T15:28:15Z" title="15 October 2024"&gt;15 October 2024&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;

                        &lt;p class="card-text flex-grow-1"&gt;This article will be the first in a series where I'll share how Artificial Intelligence has reshaped how we operate at Another Cup of Coffee.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;

        &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://migratecontent.com/drupal-7-docker-containers-migration-projects/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/Drupal-Docker-Containers-card-300-150.jpg" class="card-img-top" alt="How To Set Up Drupal 7 Docker Containers for Migration Projects"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://migratecontent.com/drupal-7-docker-containers-migration-projects/" class="listtitle"&gt;How To Set Up Drupal 7 Docker Containers for Migration Projects&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2024-09-09T13:25:15Z" title="09 September 2024"&gt;09 September 2024&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;

                        &lt;p class="card-text flex-grow-1"&gt;Learn how Docker is a valuable tool for Drupal 7 end of life migrations. In this post, I'll give a step-by-step guide to setting up a Drupal 7 container for your migration project.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;
    &lt;/div&gt;

&lt;/section&gt;</description><category>AI</category><category>Confidentiality</category><category>LLM</category><category>Operations</category><category>Privacy</category><category>Security</category><category>Workflow</category><guid>https://anothercoffee.net/secure-your-ai-workflow-using-local-tokenisation/</guid><pubDate>Tue, 12 Nov 2024 13:59:03 GMT</pubDate></item><item><title>Still Alive: A Micro Agency's 20 Year Journey</title><link>https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/</link><dc:creator>Anthony Lopez-Vito</dc:creator><description>&lt;figure&gt;&lt;img src="https://anothercoffee.net/images/a-west-london-micro-agencys-journey-to-ai-og-1200x630.jpg"&gt;&lt;/figure&gt; &lt;p&gt;Recently I was handing over tasks to an AI when it struck me how much my work has changed. It's been two years since &lt;a href="https://openai.com/index/chatgpt/" target="_blank" rel="nofollow noopener noreferrer"&gt;OpenAI released&lt;/a&gt; ChatGPT-3.5 but things were so different when I first started Another Cup of Coffee. One of my biggest problems, almost twenty years ago, was figuring out how I'd hire skilled people with no budget. Now I'm learning to understand how to use something that's been trained on all the world's knowledge, while paying less than the price of lunch.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;Another Cup of Coffee has never positioned itself as a trailblazer but over the past two decades, our approach has positioned us to adapt to industry shifts while ensuring that our clients remain happy.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;This change is astounding so I thought to reflect on our journey. I'd like to tell the story of building Another Cup of Coffee from a one-person operation to a survivor of multiple technological revolutions. It's a story about the decisions, both good and bad, that led to where we are now. Most of all, it's about learning to adapt before a shift in paradigm.&lt;/p&gt;
&lt;p&gt;I'd like to share this now because I believe the developments in Artificial Intelligence have brought us to another turning point. It's once again time to adjust to new ways of working. Before I get into that, I think it's important to look back on the past to understand how we got here.&lt;/p&gt;
&lt;h2 id="finding-our-path"&gt;Finding Our Path&lt;/h2&gt;
&lt;p&gt;Another Cup of Coffee has never positioned itself as a trailblazer. It was founded on the principles of reliability, technical expertise and long-term client support. We focus on the behind-the-scenes work so that our &lt;a href="https://anothercoffee.net/about/" title="About Another Cup of Coffee"&gt;clients can shine&lt;/a&gt;. And while we don't try to be the first to adopt every new trend, we have found there's value in selecting tools and methods that make our services better.&lt;/p&gt;
&lt;p&gt;Over the past two decades, our approach has positioned us to adapt to industry shifts while ensuring that our clients remain happy with the core of what we do for them. This philosophy was a counterpoint to the early days of the web which was an experimental and somewhat chaotic scramble to try new technologies.&lt;/p&gt;
&lt;h2 id="the-early-web"&gt;The Early Web&lt;/h2&gt;
&lt;p&gt;The web was very different in the '90s. I remember moving out of the comfortable walled garden of CompuServe into the wild web—and being completely underwhelmed. It was &lt;em&gt;slow&lt;/em&gt;, even after upgrading to a blazingly fast 28.8K modem.&lt;/p&gt;
&lt;p&gt;Browsing sites on NCSA Mosaic was confusing too but we netizens soon understood the Net to be a place without rules. And if you wanted something, you had to build it, so we did. We hand-coded sites in Notepad, Vi, FrontPage or HotDog. (Some of you must &lt;a href="https://www.webdesignmuseum.org/software/hotdog-1-0-in-1995" target="_blank" rel="nofollow noopener noreferrer"&gt;remember HotDog&lt;/a&gt;, right?) We built and experimented and figured things out as we went along.&lt;/p&gt;
&lt;figure class="figure d-flex flex-column align-items-center"&gt;
&lt;img src="https://anothercoffee.net/images/posts/hotdog-1-0-12.png" alt="Screenshot of HotDog 1.0 web editor" title="Screenshot of HotDog 1.0 web editor" class="img-fluid" style="max-width: 100%; height: auto;"&gt;
&lt;figcaption class="figure-caption text-center mt-2"&gt;HotDog 1.0 web editor released in 1995&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;After university, I landed jobs working on mobile data synchronisation, building online communities, web databases, and rolling out the first mobile services. These concepts are taken for granted now but they were pioneering back then. People quickly integrated new technologies into their business and personal life, opening up different approaches for accomplishing everyday tasks.&lt;/p&gt;
&lt;h2 id="starting-out-as-a-freelancer"&gt;Starting Out as a Freelancer&lt;/h2&gt;
&lt;p&gt;It was exciting work but I liked doing things my own way and set out as a freelancer in the early 2000s. I wore whichever hat would get work: technology analyst, database developer, systems administrator, network engineer, and of course, 'webmaster' (how quaint). Remote work wasn't a thing yet, so legitimacy meant having a proper business address.&lt;/p&gt;
&lt;aside class="pullquote"&gt;
  &lt;blockquote class="blockquote text-center red p-0"&gt;
    &lt;p&gt;The early web was about experimenting and figuring things out as we went along. Eventually, technology matured but that didn't prevent some obvious business mistakes.&lt;/p&gt;
  &lt;/blockquote&gt;
&lt;/aside&gt;

&lt;p&gt;People would blag desks from friends' businesses or pay for expensive serviced office space. I chose the quick and easy route: a rented mailbox address in Gloucester Road, London. Just a block away from the tube station and a short walk from Kensington Palace, it sounded legit and swanky! But no, by necessity I was working from home and operating a virtual office before it was the norm.&lt;/p&gt;
&lt;h2 id="fresh-coffee-and-trendy-peers"&gt;Fresh Coffee and Trendy Peers&lt;/h2&gt;
&lt;p&gt;After a few years as a solo freelancer, I realised that having a team behind me was necessary to take on more ambitious projects. When Another Cup of Coffee was formally established in 2006, it was a natural continuation of those freelancer years but with a more structured agency approach. The aim was to offer fully-functional but reasonably priced websites to small businesses, using a set of ready-built tools and an emphasis on customer service.&lt;/p&gt;
&lt;p&gt;Many developers were still rolling their own custom solutions. They retained the do-it-yourself mindset of the early web and didn't realise that open source Content Management Systems had matured enough to offer production-ready web platforms. WordPress was still mostly a blogger tool so I chose Drupal because of its flexible content structure and growing module ecosystem. We offered complete web solutions at a fraction of the cost and development time needed by most other agencies.&lt;/p&gt;
&lt;h3 id="westbourne-studios"&gt;Westbourne Studios&lt;/h3&gt;
&lt;p&gt;I set up in a client's spare meeting room in the trendy Westbourne Studios, near London's famous Portobello Road Market. Westbourne Studios was a space popular with creatives, musicians and media professionals which, after hours, transformed into one of Notting Hill's most popular nightclubs.&lt;/p&gt;
&lt;p&gt;In exchange for free office space, I handled project management for the host company. We shared the unit with three other micro agencies in the cool creative media arena so as techs, we were somewhat out of place.&lt;/p&gt;
&lt;figure class="figure d-flex flex-column align-items-center"&gt;
&lt;img src="https://anothercoffee.net/images/posts/westbourne-studios-w1200.jpg" alt="Westbourne Studios in the early 2000s" title="Potato phone photos of Westbourne Studios in the early 2000s" class="img-fluid" style="max-width: 100%; height: auto;"&gt;
&lt;figcaption class="figure-caption text-center mt-2"&gt;Potato phone photos of Westbourne Studios in the early 2000s&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Our initial team was small but promising. My friend Lan handled technical support; Karl, a colleague I befriended from my first job out of university, took care of development; Pafsanias, a brilliant fresh graduate who responded to my Gumtree job ad, created the front-end eye candy; Benjor, another friend who ran a small design studio in the Philippines, was our remote designer. I focused on project and account management while trying to build up the business. Claire (not her real name) was in charge of sales.&lt;/p&gt;
&lt;h3 id="early-mistakes"&gt;Early Mistakes&lt;/h3&gt;
&lt;p&gt;We all gelled personally but I learned an expensive lesson with Claire. You see, she couldn't actually sell. Claire was a photographer looking for side work, and I thought her outgoing personality would make up for inexperience.&lt;/p&gt;
&lt;p&gt;I was wrong. You need the right type of person in sales and she wasn't that person. It was my mistake. Perhaps I was motivated by the need to fit in with the other creatives in the building but in any case, it was a bad hire on my part. I really should have driven sales but I wasn't a salesperson either.&lt;/p&gt;
&lt;p&gt;My mistake with sales was one of several that would lead to problems balancing growth and resources. In &lt;a href="https://anothercoffee.net/still-alive-part2-balance"&gt;Part 2&lt;/a&gt;, I recount the initial challenges we faced as an agency and how we survived by transforming the way we worked.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This is the first in a series of posts about our journey and how we're adapting to Artificial Intelligence in our lives. I do hope you'll follow along with me as I share what I've learned.&lt;/em&gt;&lt;/p&gt;
&lt;section class="mt-4 pt-4"&gt;
    &lt;h3&gt;Related posts&lt;/h3&gt;

    &lt;div class="container my-4 p-4 border bg-light text-center"&gt;
        &lt;h4 class="grid-heading text-center mb-3"&gt;Still Alive: A Micro Agency's 20 Year Journey - Part 2&lt;/h4&gt;
        &lt;p&gt;In &lt;a href="https://anothercoffee.net/still-alive-part2-balance/" title="Still Alive: A Micro Agency's 20 Year Journey - Part 2"&gt;Part 2&lt;/a&gt; of &lt;em&gt;Still Alive: A Micro Agency's 20 Year Journey&lt;/em&gt;, I recount the initial challenges we faced as an agency and how we survived by transforming the way we worked.&lt;/p&gt;

        &lt;div class="button-container mt-4 mb-2"&gt;
          &lt;a href="https://anothercoffee.net/still-alive-part2-balance/" class="btn btn-primary text-white mr-3"&gt;
            Read Part 2 Now
          &lt;/a&gt;
        &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="row"&gt;
        &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://migratecontent.com/drupal-7-end-of-life-why-wordpress-is-the-best-migration-option/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/drupal-7-end-of-life-why-wordpress-is-the-best-migration-option-300x150.jpg" class="card-img-top" alt="Drupal 7 End of Life: Why WordPress is the Best Migration Option for Lower Maintenance Sites"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://migratecontent.com/drupal-7-end-of-life-why-wordpress-is-the-best-migration-option/" class="listtitle"&gt;Drupal 7 End of Life: Why WordPress is the Best Migration Option for Lower Maintenance Sites&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2024-12-17T14:25:15Z" title="17 December 2024"&gt;17 December 2024&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;

                        &lt;p class="card-text flex-grow-1"&gt;Drupal 7 support ends January 2025. Discover why WordPress is the cost-effective, user-friendly CMS for small agencies, freelancers, and businesses.&lt;/p&gt;

                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;

        &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
          &lt;div class="card h-100"&gt;
              &lt;a href="https://anothercoffee.net/still-alive-part2-balance/"&gt;
                      &lt;img src="https://anothercoffee.net/images/a-west-london-micro-agencys-journey-p2-og-1200x630.jpg" class="card-img-top" alt="Still Alive: A Micro Agency's 20 Year Journey - Part 2"&gt;&lt;/a&gt;
              &lt;div class="card-body d-flex flex-column"&gt;
                  &lt;h4 class="card-title"&gt;&lt;a href="https://anothercoffee.net/still-alive-part2-balance/" class="listtitle"&gt;Still Alive: A Micro Agency's 20 Year Journey - Part 2&lt;/a&gt;&lt;/h4&gt;
                  &lt;div class="mb-2"&gt;
                      &lt;span&gt;&lt;time class="listdate" datetime="2024-11-14T15:28:15Z" title="14 November 2024"&gt;14 November 2024&lt;/time&gt;&lt;/span&gt;
                  &lt;/div&gt;

                      &lt;p class="card-text flex-grow-1"&gt;In Part 2 of 'Still Alive', I recount the initial challenges we faced as an agency and how we survived through transformation into a cloud-first, virtual operation specializing in content migrations.&lt;/p&gt;
              &lt;/div&gt;
          &lt;/div&gt;
      &lt;/div&gt;

        &lt;div class="col-md-6 col-lg-4 mb-4"&gt;
            &lt;div class="card h-100"&gt;
                &lt;a href="https://migratecontent.com/drupal-7-docker-containers-migration-projects/"&gt;
                        &lt;img src="https://anothercoffee.net/images/posts/Drupal-Docker-Containers-card-300-150.jpg" class="card-img-top" alt="How To Set Up Drupal 7 Docker Containers for Migration Projects"&gt;&lt;/a&gt;
                &lt;div class="card-body d-flex flex-column"&gt;
                    &lt;h4 class="card-title"&gt;&lt;a href="https://migratecontent.com/drupal-7-docker-containers-migration-projects/" class="listtitle"&gt;How To Set Up Drupal 7 Docker Containers for Migration Projects&lt;/a&gt;&lt;/h4&gt;
                    &lt;div class="mb-2"&gt;
                        &lt;span&gt;&lt;time class="listdate" datetime="2024-09-09T13:25:15Z" title="09 September 2024"&gt;09 September 2024&lt;/time&gt;&lt;/span&gt;
                    &lt;/div&gt;

                        &lt;p class="card-text flex-grow-1"&gt;Learn how Docker is a valuable tool for Drupal 7 end of life migrations. In this post, I'll give a step-by-step guide to setting up a Drupal 7 container for your migration project.&lt;/p&gt;
                &lt;/div&gt;
            &lt;/div&gt;
        &lt;/div&gt;
    &lt;/div&gt;

&lt;/section&gt;

&lt;div class="mt-4 pt-4 text-muted small border-top border-bottom"&gt;
    &lt;h3 class="text-muted small"&gt;Footnotes&lt;/h3&gt;
    &lt;ul&gt;
      &lt;li&gt;Featured image photo by &lt;a href="https://unsplash.com/@emilianovittoriosi?utm_content=creditCopyText&amp;amp;utm_medium=referral&amp;amp;utm_source=unsplash" target="_blank" rel="nofollow noopener noreferrer"&gt;Emiliano Vittoriosi&lt;/a&gt;.&lt;/li&gt;
      &lt;li&gt;HotDog 1.0 screenshot from &lt;a href="https://www.webdesignmuseum.org/software/hotdog-1-0-in-1995" target="_blank" rel="nofollow noopener noreferrer"&gt;The Web Design Museum&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
&lt;/div&gt;</description><category>About Us</category><category>Agency</category><category>AI</category><category>Business</category><category>LLM</category><category>Operations</category><category>Startups</category><category>Workflow</category><guid>https://anothercoffee.net/still-alive-a-micro-agencys-20-year-journey/</guid><pubDate>Tue, 15 Oct 2024 15:28:15 GMT</pubDate></item></channel></rss>