7 sats \ 4 replies \ @optimism 12h \ parent \ on: My bad experiences using AI as a physicist science
Currently I'm pivoting my setup
from:
Synchronous: letting the LLM run unstructured with different models in a pipeline
to:
Asynchronous:
- LLM generates code or human writes it - doesn't matter - and uploads to repo
- Issue detection:
  - Linting logs one issue per error found
  - If none are found, the LLM can analyze the code and create an issue for the most significant problem. I repeat the instruction throughout the prompt to report only the most significant issue. Works OK with an LRM
- Users can of course add issues too; the LLM analyzes whether it's a one-shot or needs breakdown
- Coding LLM can ingest issue and fix it with a pull req
- Pull req can get reviewed by LRM or human
- Human merges
Everything that can be done with code, like linting, does not use LLMs.
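A minimal sketch of the issue-detection step described above, assuming the linter output is already available as a list of strings. `Issue`, `detect_issues`, and `ask_llm_for_top_issue` are hypothetical names for illustration, not part of any actual setup; the point is only that the LLM is consulted when the deterministic tool finds nothing, and then yields at most one issue.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Issue:
    title: str
    source: str  # "lint" or "llm" (hypothetical labels)


def detect_issues(
    lint_errors: List[str],
    ask_llm_for_top_issue: Callable[[], Optional[str]],
) -> List[Issue]:
    """Deterministic linting first; the LLM is only a fallback."""
    if lint_errors:
        # One issue per linter error, no LLM involved.
        return [Issue(title=err, source="lint") for err in lint_errors]
    # Linter is clean: ask the LLM for at most one issue.
    top = ask_llm_for_top_issue()
    return [Issue(title=top, source="llm")] if top else []


if __name__ == "__main__":
    # Stand-in for a real model call (assumption for the demo).
    def fake_llm() -> Optional[str]:
        return "Retry loop never backs off"

    for issue in detect_issues(["E501 line too long (main.py:12)"], fake_llm):
        print(issue)
    for issue in detect_issues([], fake_llm):
        print(issue)
```

Passing the LLM in as a callable keeps the lint path free of any model dependency, which matches the rule that anything doable with code does not use an LLM.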
Damn, I just wanted the AI to check if my plant was alive and it built a greenhouse with a self-watering system and an AI-powered scarecrow.
I swear, sometimes ChatGPT doesn’t review code — it rewrites it like it's auditioning for a job at NASA. Like bro, I’m still trying to survive public static void main, not orchestrate microservices across a Kubernetes cluster.
Same thing with chemistry — I asked for a fresh ferrous sulphate recipe and got a mining operation flowchart straight outta a metallurgy PhD thesis. Asked my chem teacher and he just said “use Fe + H₂SO₄ and move on.”
It's like these LLMs read Thus Spoke Zarathustra and thought every answer must ascend the mountain of abstraction before descending to meet us mortals.
“He who climbs upon the highest mountains laughs at all tragedies, real or imagined." — Nietzsche (Clearly what GPT thinks before it answers a 4-mark question.)
But fr tho, loving that async pivot you're on @optimism. Turning LLMs from noisy sidekicks into focused bug-hunters with issue-detection filtering? That’s pretty GOOD.
Don't worry, I won't steal your repo. I'm building a Human Behaviour Prediction Engine too, https://github.com/axelvyrn/TiresiasIQ (and it's quite good, believe me - I'd like your input)
Also, curious: How are you ranking issue significance without it hallucinating a crisis over a missing semicolon?
> How are you ranking issue significance
It doesn't matter. Every task should be small, or otherwise needs breakdown.
> without it hallucinating a crisis over a missing semicolon?
It's harder to make it "just fix a semicolon", so in that case using non-LLM tools is better, or at least expose the needed tools to the LLM through MCP. Syntax fixing can be done with existing tools, so in this case you just expose an MCP tool, e.g.:
code_fixing::correct_semicolons(files[])
that implements the syntactical logic in code, without needing the LLM to actually write correct code.
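A rough sketch of what such a tool could look like, under loud assumptions: the "fix" here is a toy rule (collapse a run of semicolons at the end of a line into one), and the `TOOLS` dict only stands in for registering the function with an MCP server, which is not shown. `fix_text` and `TOOLS` are hypothetical names; only the tool name comes from the line above.

```python
import re
from pathlib import Path
from typing import Callable, Dict, List

# Toy rule (an assumption): ";;" or longer at end of line becomes ";".
_REPEATED_SEMICOLONS = re.compile(r";{2,}[ \t]*$", re.MULTILINE)


def fix_text(source: str) -> str:
    """Pure, deterministic text transformation; no LLM involved."""
    return _REPEATED_SEMICOLONS.sub(";", source)


def correct_semicolons(files: List[str]) -> List[str]:
    """Rewrite each file in place and return the paths that changed."""
    changed = []
    for name in files:
        path = Path(name)
        original = path.read_text(encoding="utf-8")
        fixed = fix_text(original)
        if fixed != original:
            path.write_text(fixed, encoding="utf-8")
            changed.append(name)
    return changed


# Stand-in for tool registration: the LLM would only pick a tool name and
# pass file paths; the syntactic logic lives entirely in the code above.
TOOLS: Dict[str, Callable[[List[str]], List[str]]] = {
    "code_fixing::correct_semicolons": correct_semicolons,
}
```

The model never has to produce corrected source itself; it only decides to call the tool and with which files.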