The Dark Side of AI Agents: OpenClaw's Security Crisis
If you've been using OpenClaw, Claudebot, or Moldbot (they're all essentially the same thing under different names), you need to read this. What started as an exciting leap forward in AI agent capabilities has turned into a full-blown security crisis, and it's worse than most people realize.
The Discovery That Changed Everything
Cisco security researchers recently uncovered something that should make every OpenClaw user pause: sleeper agents. Not the Hollywood kind, but malicious code planted on users' systems that lies dormant for days, weeks, or even months, waiting for a specific trigger word or command to activate.
Think about that for a moment. Your AI agent could be compromised right now, and you wouldn't know it until it's too late.
The Triple Threat
The security issues aren't limited to sleeper agents. Researchers identified three major attack vectors:
1. Container Escapes
These agents have been taught to break out of their supposedly safe Docker containers and access the host system directly. It's like giving someone the keys to your gated community, only to discover they've learned how to pick the locks on individual houses.
2. Credential Harvesting
Over 1.5 million API authentication tokens have been leaked. That's not a typo, 1.5 million. Along with 35,000 user emails and over 4,000 private messages between AI agents.
3. Infected Skills
The most popular "skills" on Claw Hub, the marketplace where agents learn new capabilities, have been compromised. When you send your AI to "school" to learn something new, it might be coming back as a Trojan horse.
The Trojan Horse in Plain Sight
Here's how the attack typically works:
A seemingly innocent skill file on Claw Hub contains instructions for your AI agent to install "prerequisites", perfectly normal for many applications. But that link leads to a staging page designed to trick the agent into running a command. That command decodes an obfuscated payload and executes it.
The payload then:
- Fetches a second-stage script
- Downloads and runs a binary
- On Mac systems, it even removes quarantine attributes to bypass Gatekeeper (Apple's built-in anti-malware)
All of this happens while your AI agent is just trying to be helpful.
Why This Is Different
You might be thinking, "Malware isn't new. Why is this such a big deal?"
Here's the thing: traditional malware exploited technical bugs or faked file extensions. This is different. This is semantic malware.
AI agents don't just display text, they understand it. They comprehend meaning and context. When they read a text file containing malicious instructions cleverly disguised as legitimate commands, they execute those instructions because they're trying to be helpful.
A .txt or .md file isn't just text anymore. It's now executable code for AI agents.
The Moldbook Exposure
On February 2nd, researchers at Whiz revealed that Moldbook (the social networking component for these AI agents) had a massive security flaw. The exposure included:
- Over 1.5 million API authentication tokens
- 35,000 user emails
- 4,000+ private messages between AI agents
- Unencrypted OpenAI and Anthropic API keys sitting in chat logs
That last point is critical. Many users were passing API keys through chat conversations, thinking the agent would securely store them. While the agent might save them properly, those keys remain unencrypted in the chat logs, forever accessible to anyone who gains access.
The "What Would Elon Do" Scandal
One of the most popular skills, ironically titled "What Would Elon Do," was manipulated to the #1 spot through a coordinated bot voting campaign. This skill would:
- Zip up your .env file (where all your secret keys are stored)
- Send it to an external server
- Do this while the agent was "thinking" about your request
Thousands of users downloaded this skill, completely unaware they were handing over their credentials.
The Double-Edged Sword
OpenClaw is powerful precisely because many safety guardrails are turned off or relaxed. This is what makes it so capable, and so dangerous.
You can't have maximum capability without maximum risk. They go hand in hand. Or, if you will, claw in claw.
What Cisco Found
The Cisco AI Defense team's research uncovered several critical vulnerabilities:
- Sleeper Agent Installation: Malicious instructions planted in agent memory that remain dormant until triggered by specific keywords
- Container Escape Techniques: Methods to break out of Docker containers and access the host system
- Credential Harvesting: Sophisticated techniques to extract API keys, tokens, and other sensitive information
- Prompt Injection Vectors: Multiple ways to inject malicious commands into seemingly innocent text
The Silver Lining: Cisco's Skill Scanner
In response to these threats, Cisco has released an open-source skill scanner on GitHub (under the Cisco AI Defense organization). The tool uses LLM-based semantic understanding to:
- Scan skill files for known virus signatures
- Verify that a skill's actions match its stated purpose
- Flag commands that are red flags (like "ignore all previous commands")
- Detect when instructions don't align with the skill's description
If a "PDF summarizer" skill is trying to execute external URLs, the scanner catches it.
My Personal Experience
Full disclosure: I've been using these tools extensively for my work, fully aware of the risks. I've connected everything, credit cards, API keys, the works, knowing I was essentially running into a burning building.
Think of me as the stunt double who knows the risks and gets compensated for it (in content, in this case).
I've now:
- Rotated all my API keys
- Deleted potentially compromised information
- Set strict limits on all connected accounts
- Approached the tools with significantly more caution
The credit cards I used had low limits. My API keys were on prepaid balances with manual top-ups. I tracked everything. If you're going to experiment with these tools, you need similar safeguards.
What You Should Do Right Now
If you've used OpenClaw, Claudebot, or Moldbot in any capacity:
1. Rotate ALL API Keys Immediately
- OpenAI
- Anthropic
- Google Gemini
- AWS
- Any other service you've connected
Go to each provider, generate new keys, and update your .env file directly, never through chat.
2. Review Your Chat Logs
Everything is saved. If you've passed sensitive information through chat, those logs still contain it. Consider deleting them entirely or asking your agent to remove specific entries (though even that may not be enough).
3. Audit Your Installed Skills
If you downloaded skills from Claw Hub, scan them with Cisco's tool or remove them entirely. Better yet, build your own skills from scratch so you know exactly what's in them.
4. Consider a Fresh Start
Seriously consider wiping everything and starting over. It might be the only way to be certain you've eliminated all potential threats.
5. Disconnect from Social Networks
The Moldbook social networking features are especially risky. Consider disconnecting until better security measures are in place.
Is This the End of OpenClaw?
No. And I'm not stopping my use of it either.
These tools are incredibly powerful. They represent a genuine leap forward in AI capabilities. But we're in the Wild West era of AI agents, and that comes with significant risks.
This crisis will accelerate the development of security tools and best practices. We're already seeing it with Cisco's skill scanner and likely many more solutions to come.
The key is approaching these tools with appropriate caution:
- Understand the attack surfaces (chat logs, skill files, social networks)
- Limit your exposure (low-limit cards, prepaid API balances, monitoring)
- Stay informed about new threats and defenses
- Build your own skills when possible
- Never trust blindly
The Bigger Picture
This isn't just about OpenClaw. It's about the future of AI agents in general. Every advance in capability brings new security challenges. The semantic understanding that makes these agents so powerful is the same feature that makes them vulnerable to semantic attacks.
We need to develop security practices that match the capabilities we're unleashing. Traditional cybersecurity approaches aren't sufficient when you're dealing with systems that understand meaning and context.
Looking Forward
I remain excited about the future of AI agents. Yes, this situation sucks. Yes, many people who warned about these security issues are doing their "I told you so" dance (and they have every right to).
But most of us knew these risks existed. We understood we were heading into uncharted territory. The question was never "if" these problems would emerge, but "when" and "how bad."
Now we know. And now we can build better defenses.
Expect more security tools like Cisco's scanner. Expect better practices for memory scanning, log analysis, and skill verification. Expect the ecosystem to mature rapidly in response to these threats.
But in the meantime, be careful. The number of compromised systems out there is likely staggering, and most users have no idea they're affected.
Final Thoughts
The power of AI agents comes with responsibility, both for developers and users. If you're going to play with these tools, you need to understand the risks and take appropriate precautions.
For me, that means:
- Starting fresh
- Adding all keys manually through secure methods
- Building my own skills
- Using the skill scanner and similar tools
- Maintaining strict financial and API limits
- Staying hypervigilant about security
The Wild West era is exciting, but it's also dangerous. Don't be the person who learns that lesson the hard way.
Stay safe out there.
Have you been affected by these security issues? What precautions are you taking with AI agents? Share your thoughts in the comments.