pull down to refresh

Last week I saw this paper from OpenAI called “Preventing URL-Based Data Exfiltration in Language-Model Agents”, which goes into detail on new mitigations they’ve added.



This is a great read. I like this transparency.

Initial Disclosure in 2023Initial Disclosure in 2023

Tackling the ProblemTackling the Problem

What Is the New MitigationWhat Is the New Mitigation

How Can It Be Bypassed?How Can It Be Bypassed?

Additional Mitigation IdeasAdditional Mitigation Ideas

Final Risk - Thorough Adoption of the MitigationFinal Risk - Thorough Adoption of the Mitigation

...read more at embracethered.com