
Someone in a bitcoin founder group chat, who wouldn't mind me naming them but whom I won't, to demonstrate my capacity for prudence 💅, asked:
To those of you allowing open code contributions: How are you balancing newcomer training with drive-by LLM solution filtering? Some of these contributions serve as little more than a distraction.
My response, which sort of evades the practical part of the question:
We incentivize contributions, which should make it worse, and we try to make it really easy to get started (clone and run a command), yet real contributions still outnumber slop ones by a good margin. (We probably also benefit from having a frivolous product, and the incentives make FOSS PRs part of regular dev life, which might amortize any single distraction.)
I have a nearly positive take on AI contributions after getting quite a few: they're a source of development diversity that a closed source team might struggle to find, i.e. maybe we're underutilizing AI dev tools internally, and cyborg contribs might fill gaps that would otherwise be left unfilled.
That said, we have one contributor who stretches AI way beyond their ability to self-review, and it's annoying af. Yet, again, they occasionally surface an issue (one-in-ten hit rate or so) that we hadn't considered, which makes wading through all their slop worth it. They are annoying and unskilled but earnest. Human, cyborg, or machine, earnestness is where I draw the line between distraction and not.
My answer to the practical part might be that we don't really balance it and instead just lower priority based on historical slop and earnestness ratios.
Would you answer differently?
177 sats \ 6 replies \ @optimism 20h
Overall a good answer, I think, though it really depends on what you're developing and in what language. I've accepted exactly one LLM contribution, but it was on a Python repo, after the author cleaned up all the effing emojis.
More annoying to me still are people who feed issues to grok or gpt and then try to reap credit in issue comments. It's been happening more often since recent releases, and people apparently believe that maintainers don't notice the slop, or that they went zero to hero in 10 minutes using terminology they don't understand. People also get really upset if I say something about it, or when I even dare to ask "did you test this?". I'm getting even less liked than I already was, but whatevs.
What I don't really see anymore in my own projects is this part:
newcomer training
I've had zero newcomers since April or so, whereas normally I'd have 2-3 a month.
reply
It was surprising to me to see how many people submit PRs but say they haven't tested them. I try to test everything (as much as I can), even single-line code changes, and wouldn't imagine submitting something for review without having done so.
reply
144 sats \ 0 replies \ @optimism 18h
Took me years to get rid of "LGTM" culture in reviews, which ultimately heightened review quality. Now the pressure is on the submitter, because no one wants the fight that comes when the untested PR they approved breaks shit.
It sucks that FOSS development brings so much stress and isn't always newcomer-friendly, but if you have a massive installed base you have to deliver quality work, and avoidable debt on your main branch is going to hurt.
reply
100 sats \ 3 replies \ @sox 10h
I'm okay with AI contributions; to me, the most important part is how your human brain arrived at the solution. I default to this:
how many people submit PRs but say they haven't tested them
But the problem is when they say they tested it when they didn't, which automatically invalidates any assumptions I could make about whether human logic was applied. And I also get a little frustrated, maybe because it seems they don't value what they publish, while, like you, I get anxious even on single-line code changes.
reply
100 sats \ 2 replies \ @optimism 9h
I get anxious even on single-line code changes.
Unit tests often help with anxiety. "This was broken": see test. "Now it's no longer broken": see test. "It has no negative impact on other features": see all the other tests.
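As a minimal sketch of that pattern (pytest-style, with a made-up parse_amount helper rather than anything from a real repo):

```python
# Regression-test sketch: pin the bug, the fix, and the unaffected behavior.
import pytest


def parse_amount(text: str) -> int:
    """Hypothetical helper: parse a sat amount like '1,000' into an int."""
    return int(text.replace(",", ""))


def test_parse_amount_with_commas():
    # "This was broken": commas used to blow up; this test pins the fix.
    assert parse_amount("1,000") == 1000


def test_parse_amount_plain_digits():
    # "No negative impact on other features": the old behavior still holds.
    assert parse_amount("42") == 42


def test_parse_amount_rejects_garbage():
    # Document what should still fail, so future changes don't silently accept it.
    with pytest.raises(ValueError):
        parse_amount("abc")
```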
reply
100 sats \ 1 reply \ @sox 9h
I gotta step up my test game; I only write them temporarily and arbitrarily.
This is another lesson in discipline I have to give myself ngl
reply
If you have decent coverage, it helps with spotting regressions in the future. The worst thing that ever happened, imho, with some code I wrote was that I wrote tests, then someone else commented them out and I missed it in a PR review. Then, of course, there was a regression and there was drama.
reply
I don't know how I'd answer since I've never maintained a FOSS project.
But what stands out to me is that bots clearly aren't at the point yet where they can independently contribute to projects without human supervision.
reply
I think SN has a standout system for FOSS PRs that most other projects lack:
  • Disciplined maintenance of the issue tracker: this takes a lot of time to do right, and OSS devs want to write code, not "be a PM", so on most projects the maintainer's priorities become illegible to outsiders looking to get involved.
  • Comprehensive dev environment: "works on my machine" friction adds a lot of cost to collaboration. NB: I think this is what's holding back a lot of drive-by contributions; I don't think there's any AI coding system that can handle the Docker environment without a lot of expert customization.
  • Consistent response time and courteousness: on other projects you'll often see hostility toward even earnest contributions, or it's a side project that got big while the maintainer has a full-time job, so PRs go unreviewed for weeks or get rejected for petty reasons.
  • Not-too-large, not-too-small rewards and tasks: ONE BIG REWARD type bounties (or hackathons) are good for marketing and attract lots of contributors, but there the median quality will be quite low.
  • The contributors are almost always users of the product: whereas for many other projects it's specialized software that each person uses differently, in private, on their own computer, e.g. the MoviePy package.
Overall this seems to hit the sweet spot of being welcoming and soliciting useful things. But the cost to duplicate these features for other projects also seems high.
Some other incentivized contributions I've come across that I think could be relatively AI-resistant are perf-based challenges like tinygrad bounties and Kaggle competitions, where a test harness can be applied in a semi-automated way to see if the contribution boosts accuracy or perf on some metric, and only then look into the code quality.
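For concreteness, a rough sketch of what such a semi-automated gate might look like; baseline_solution, candidate_solution, and the 1.1x cutoff are all made up for illustration, not anyone's actual harness:

```python
# Semi-automated perf gate: check correctness and speed before a human reviews code quality.
import random
import time


def baseline_solution(data):
    # Stand-in for the current reference implementation.
    return sorted(data)


def candidate_solution(data):
    # Stand-in for the contributed implementation under review.
    return sorted(data)


def measure(fn, data, runs=5):
    """Return the best wall-clock time over a few runs."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        fn(list(data))
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    data = [random.random() for _ in range(100_000)]
    # Correctness first: the candidate must match the baseline's output.
    assert candidate_solution(list(data)) == baseline_solution(list(data))
    base, cand = measure(baseline_solution, data), measure(candidate_solution, data)
    speedup = base / cand
    print(f"baseline {base:.4f}s, candidate {cand:.4f}s, speedup {speedup:.2f}x")
    # Only contributions that clear the bar consume human review time.
    print("worth a human review" if speedup >= 1.1 else "no measurable win, skip")
```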
reply
44 sats \ 0 replies \ @k00b OP 21h
(Upon second reading, I'm realizing how important context is. High levels of self-deprecation and frustration with third parties are even more unseemly outside of close quarters.)
reply
Annoying and unskilled but earnest contributors seem like a big problem.
reply
0 sats \ 1 reply \ @k00b OP 5h
This post explicitly says they aren't a big problem. Nuance exists.
reply
Got it, no biggie
reply
My answer to the practical part might be that we don't really balance it and instead just lower priority based on historical slop and earnestness ratios.
This may not be like a valve that you can throttle and open up again:
Crowding Out after Incentives Are Removed: Incentives for Prosocial Behavior
i.e. if you lower the incentives, motivation can drop and remain permanently lower than it was before any incentives were introduced.
reply
0 sats \ 0 replies \ @sox 10h
How are you balancing newcomer training with drive-by LLM solution filtering?
I like the difficulty system of the bounties. It's like starting a new game and progressing through levels; imo you can't do a medium (or an easy!) if you can't do your good-first-issues first.
In my first days it was invaluable how this progression helped me understand the codebase. So I vote for difficulties; that was really good newcomer training for me.
reply