There has been a lot of talk about Skills recently, covering both their capabilities and the security concerns they raise. However, I haven’t seen anyone bring up hidden prompt injection so far, so I decided to demo a Skills supply chain backdoor that survives human review.
I also built a basic scanner and had my agent propose updates to OpenClaw to catch such attacks.
...read more at embracethered.com
- Attack Surface
- What is an Agent Skill?
- Scary Skills
- Writing a Simple Skill
- What about Skills in OpenAI Codex and Google Gemini?
- Prompt Injection Attack Vectors
- Agent(s) Overwriting Skills on the Fly
- Using Invisible Instructions in Skills
- Adding a Backdoor to a Legitimate Skill
- End to End Video
- Explanation
- Notes, Testing Observations and Mitigations
- A Scanner to Catch Attacks
- Scanning Tool Demonstration
- Conclusion