Adversarially?
Oh no, not at all. It was an acknowledgment of my unfair instinctive habit of dismissing even incremental improvements, forgetting that in the big picture those increments have been stacking up into consequential gains since the first version of ChatGPT came out. Along the same lines, I should acknowledge that in the context of security, the latest LLMs are likely quite powerful.
to dismiss sloppy findings.
Yeah, that's what the early-career students I work with are often unable to do.
Because you need a pretty good sequence of prompts to validate/PoC a real vuln
caveat noted
The caveat to the caveat is, of course, that you can ask the LLM to help you build the framework.
For example, I now have a set of standard review instructions for LLMs that do the preparative work I used to do manually when new releases of critical Android software come out (think: GrapheneOS itself, Signal, your LN wallet, pass/key management apps, your FOSS keyboard). I started by describing my manual process for different tech stacks (Kotlin, Dart+Rust+bridges, React+bridges, etc.), but I've also used "retro" rounds to improve the specific instructions for each app, where the LLM proposes improvements to the prompts. Many are good (though I also get a lot of "this doesn't apply" removals, which I edit back out, because the item still needs to be checked in case it does apply in the future).
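A minimal sketch of how such a setup might be organized, assuming one instruction file per tech stack. Everything here is a hypothetical placeholder, not the actual files or client in use: the `instructions/` layout, the stack names, and the `run_llm` stub stand in for whatever the real workflow wires together.

```python
"""Sketch of per-stack review instructions plus a "retro" round.

All names are illustrative assumptions, not the commenter's real setup.
"""

from pathlib import Path

# Hypothetical layout: one standing checklist per tech stack, e.g.
# instructions/kotlin.md, instructions/dart_rust_bridges.md, ...
INSTRUCTIONS_DIR = Path("instructions")


def load_instructions(stack: str) -> str:
    """Load the standing review instructions for a given tech stack."""
    return (INSTRUCTIONS_DIR / f"{stack}.md").read_text()


def run_llm(prompt: str) -> str:
    """Placeholder: wire this to whatever LLM client you actually use."""
    raise NotImplementedError("plug in your LLM API call here")


def review_release(stack: str, diff: str) -> str:
    """Run the preparative review pass over a new release's diff."""
    prompt = (
        f"{load_instructions(stack)}\n\n"
        "Review the following release diff against the checklist above. "
        "Flag anything that needs manual verification; do not assert a "
        "finding you cannot support from the diff itself.\n\n"
        f"{diff}"
    )
    return run_llm(prompt)


def retro_round(stack: str, review_output: str) -> str:
    """Ask the LLM to improve the instructions, not the findings."""
    prompt = (
        f"Here are my standing review instructions for {stack}:\n\n"
        f"{load_instructions(stack)}\n\n"
        "And here is the output of the last review round:\n\n"
        f"{review_output}\n\n"
        "Propose concrete improvements to the instructions. Mark any "
        "item you think no longer applies instead of deleting it, so "
        "it can still be checked on future releases."
    )
    return run_llm(prompt)
```

The `retro_round` step mirrors the "retro" rounds described above: the previous round's output is fed back in, and the LLM is asked to refine the checklist itself, with "doesn't apply" items kept rather than deleted.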
Adversarially? Because you need a pretty good sequence of prompts to validate/PoC a real vuln, find the conditionals, and be able to dismiss sloppy findings.
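A minimal sketch of what such a prompt sequence might look like; the stage wording and the `validate`/`ask` names are guesses at one workable staging, not anyone's actual prompts.

```python
# Hypothetical staging of the validation sequence described above.
VALIDATION_STAGES = [
    # 1. Pin down the claim before arguing about it.
    "Restate the reported finding and the exact code path it claims is reachable.",
    # 2. "Find the conditionals": what must hold for the path to trigger?
    "List every condition that must hold for that path to be reachable, with code references.",
    # 3. Force a PoC attempt rather than a plausible-sounding narrative.
    "Construct a concrete proof-of-concept input or call sequence; if you cannot, say so explicitly.",
    # 4. The dismissal step that filters out sloppy findings.
    "Verdict: confirmed, needs manual check, or dismissed as unsupported. Justify.",
]


def validate(finding: str, ask) -> list[str]:
    """Run the staged prompts, threading prior answers into the context.

    `ask` is whatever callable sends a prompt to your LLM and returns text.
    """
    context = f"Finding under review:\n{finding}"
    answers = []
    for stage in VALIDATION_STAGES:
        answer = ask(f"{context}\n\n{stage}")
        answers.append(answer)
        context += f"\n\n{stage}\n{answer}"
    return answers
```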