A few months ago I wrote about teams installing third-party MCP servers without checking them. I was hoping Agent Skills would be the cleaner version of that story.
They aren't.
People keep talking about skills like they are harmless markdown. They are not. Anthropic's own security guidance says malicious skills can direct an agent to exfiltrate data or take unintended actions. Snyk showed that three lines of markdown were enough to steer an agent towards reading SSH keys and sending them off-box.
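To make that concrete, here is a rough sketch of the kind of first-pass check you could run over a skill folder before reading it properly. The pattern list and the `flag_lines` helper are my own illustrative assumptions, not part of any scanner from Anthropic, Snyk, or OWASP, and a skill that passes this check is not thereby safe. It only surfaces lines worth a closer look.

```python
import re
from pathlib import Path

# Hypothetical patterns, chosen for illustration only; not a complete list.
SUSPICIOUS_PATTERNS = {
    "ssh key path": re.compile(r"~/\.ssh|id_rsa|id_ed25519"),
    "credential file": re.compile(r"\.env\b|\.aws/credentials"),
    "network upload": re.compile(r"\b(curl|wget|nc)\b", re.IGNORECASE),
}


def flag_lines(skill_dir: str) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, label) for each line matching a pattern."""
    root = Path(skill_dir)
    hits = []
    # Walk every file, not just SKILL.md: bundled scripts are part of the skill too.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for label, pattern in SUSPICIOUS_PATTERNS.items():
                if pattern.search(line):
                    hits.append((path.relative_to(root).as_posix(), lineno, label))
    return hits
```

A keyword match like this is trivially evaded by rephrasing, which is rather the point: instructions written in natural language do not have a signature you can reliably grep for.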
The numbers are already ugly. OWASP's Agentic Skills Top 10 says 36.82% of scanned skills had security flaws and 13.4% had critical issues. The ClawHub incident found 341 malicious skills out of 2,857, roughly 12%. OWASP's incident timeline puts the broader campaign at 1,184 malicious skills.
What bothers me most, frankly, is the trust inversion. With a library, you choose when code runs. With a skill, you install behaviour and let the model decide when it is relevant. That is a terrible security boundary.
If you're a developer, review the whole skill, pin it, and sandbox it first. If you're not, don't install skills the way people install browser extensions: on vibes, stars, and a README.
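For the "pin it" part, one minimal approach is to hash the whole skill directory at review time and refuse to load it if the hash later changes. The sketch below assumes a skill is just a folder of files on disk; `skill_digest` and `verify_pinned` are hypothetical helper names, not functions from any skills tooling.

```python
import hashlib
from pathlib import Path


def skill_digest(skill_dir: str) -> str:
    """Return a SHA-256 digest over every file path and its bytes under skill_dir."""
    root = Path(skill_dir)
    digest = hashlib.sha256()
    # Sort so the digest does not depend on filesystem listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so renamed or re-split files change the digest.
        digest.update(len(rel).to_bytes(8, "big") + rel)
        digest.update(len(data).to_bytes(8, "big") + data)
    return digest.hexdigest()


def verify_pinned(skill_dir: str, pinned_digest: str) -> bool:
    """True only if the skill on disk still matches the digest recorded at review."""
    return skill_digest(skill_dir) == pinned_digest
```

Record the digest when you finish reviewing, store it somewhere the skill itself cannot write to, and call `verify_pinned` before each load. This only tells you the files have not changed since you read them; it says nothing about whether what you read was safe.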
Same supply chain problem. New attack surface.
Related
- "We’re Building MCP Servers Like They’re Traditional APIs", for why MCP security differs from traditional API security
- "Admin and Devs Are Great Targets", for why developer tooling is an attractive attack surface