Has the Mythos hype been proven to be rooted in reality?
Have any of the early-access users confirmed that Mythos is truly more than just incremental improvement?
It doesn't really matter. The danger has been there for over a year:
- Security professionals that know what they are doing are able to move in hours instead of days/weeks
- Non-professionals may get lucky
Any increment in capabilities to any actor that is either a blackhat or an a-hole on a luck streak is an increment in danger.
At this moment, with Mythos unreleased, the greatest immediate threat is that Opus 4.7 and GPT 5.5 got mainlined into subscription plans. Those are increments, yet if you ran your own adversarial analysis, you have to do it again, because the baseline changed. And pray your framework holds up (and no, plain installs of trail-of-bits skills are not holding up).
Fair enough.
- Non-professionals may get lucky
I probably have been interacting with too many early-career academics who fall into this category, without the actual luck factor, dismissing the actual benefits I've been experiencing in the code I've moved with a few well-targeted prompts.
Adversarially?
Oh no, not at all. It was an acknowledgment of my unfair instinctive habit of being dismissive of even incremental improvements, forgetting that in the big picture, these incremental improvements have been stacking up quite well into consequential improvements since the first version of ChatGPT came out. In the same line of thought, I should acknowledge that in the context of security, the latest LLMs are likely quite powerful.
to dismiss sloppy findings.
Yeah, that's what the early-career students I work with are often unable to do.
Because you need some pretty good sequence of prompts to get to validate/poc a real vuln
caveat noted
caveat noted
Caveat to the caveat is of course that you can ask the LLM to help you build the framework.
For example, I now have a set of standard review instructions for LLMs that do the preparatory work I used to do manually when new releases of critical Android software come out (think: GrapheneOS itself, Signal, your LN wallet, pass/key management apps, your FOSS keyboard). I started by describing my manual process for different tech stacks (Kotlin, Dart+Rust+bridges, React+bridges, etc.), but I've also used "retro" rounds to improve the specific instructions for each app, and there the LLM proposes improvements to the prompts. Many are good (though I also get a lot of "this doesn't apply", which I edit out because it still needs to be checked in case it does apply in the future).
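To make that workflow a bit more concrete, here is a rough sketch of how per-stack instructions plus per-app "retro" additions could be assembled into one review prompt. Everything in it is an assumption for illustration: the check wording, the `STACK_CHECKS` table, the `AppProfile` class and the `build_review_prompt` helper are hypothetical names, not the actual instructions described above.

```python
from dataclasses import dataclass, field

# Hypothetical baseline checks per tech stack. The wording is illustrative
# only; a real checklist would come from your own manual review process.
STACK_CHECKS = {
    "kotlin": [
        "List new or changed permissions and exported components.",
        "Flag new network endpoints and changes to TLS handling.",
    ],
    "dart+rust+bridges": [
        "Review every function exposed across the Dart/Rust bridge.",
        "Flag new unsafe blocks and changed dependencies on the Rust side.",
    ],
    "react+bridges": [
        "Review native modules newly exposed to JavaScript.",
        "Flag dependency changes in the JavaScript lockfile.",
    ],
}


@dataclass
class AppProfile:
    """One reviewed app: its stack plus checks added in retro rounds."""
    name: str
    stack: str
    extra_checks: list[str] = field(default_factory=list)


def build_review_prompt(app: AppProfile, old_version: str, new_version: str) -> str:
    """Assemble the review instructions for one release of one app."""
    checks = STACK_CHECKS[app.stack] + app.extra_checks
    lines = [
        f"Review the changes in {app.name} between {old_version} and {new_version}.",
        # Checks are never dropped for being irrelevant today: the model must
        # answer each one, even if the answer is that it does not apply.
        "For every check, report findings or state that it does not apply in this release:",
    ]
    lines += [f"{i}. {check}" for i, check in enumerate(checks, start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    wallet = AppProfile(
        name="ExampleWallet",  # hypothetical app
        stack="kotlin",
        extra_checks=["Check changes to seed backup and export flows."],
    )
    print(build_review_prompt(wallet, "1.0", "1.1"))
```

The point of the design is that a check that does not apply to this release still stays in the list, so it gets re-asked on every release instead of silently disappearing.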
Has the Mythos hype been proven to be rooted in reality?
Have any of the early-access users confirmed that Mythos is truly more than just incremental improvement?
Not the topic of this blogpost, but using Mythos as a dramatic comparison point requires more than just VC hype, especially if it is used as an argument to say QC is more than just hype.
Always interesting though to read Scott's insights.