Tau² Benchmark: How a Prompt Rewrite Boosted GPT-5-mini by 22% quesma.com/blog/tau2-benchmark-improving-results-smaller-models/

130 sats \ 0 comments \ @carter 24 Sep 2025 AI

related

GPT‑5.4 mini and nano released openai.com/index/introducing-gpt-5-4-mini-and-nano/

373 sats \ 0 comments \ @lunin 18 Mar AI

You can ask GPT-5 to pretend it is dumber than it is

517 sats \ 0 comments \ @Tony 16 Aug 2025 AI

Retiring GPT-4o, GPT-4.1, GPT-4.1 mini, and OpenAI o4-mini in ChatGPT openai.com/index/retiring-gpt-4o-and-older-models/

230 sats \ 1 comment \ @lunin 31 Jan AI

Claude 3 beats GPT-4 on Aider's code editing benchmark aider.chat/2024/03/08/claude-3.html

377 sats \ 2 comments \ @hn 31 Mar 2024 tech

MCP-Bench: Benchmarking Tool-Using LLM Agents arxiv.org/abs/2508.20453

269 sats \ 0 comments \ @optimism 30 Aug 2025 AI

OpenAI's GPT-5 is a cost cutting exercise www.theregister.com/2025/08/13/gpt_5_cost_cutting

247 sats \ 1 comment \ @Coinsreporter 13 Aug 2025 AI

ChatGPT's 4o-mini model just got a big upgrade – here are 4 best new features www.techradar.com/computing/artificial-intelligence/chatgpts-4o-mini-model-just-got-a-big-upgrade-here-are-4-of-the-best-new-features

220 sats \ 0 comments \ @ch0k1 29 Sep 2024 news

The week in AI, August 4-10, 2025

2353 sats \ 12 comments \ @optimism 11 Aug 2025 AI

What are your first impressions from ChatGPT5?

1172 sats \ 9 comments \ @carter 8 Aug 2025 AI

GPT-fabricated scientific papers on Google Scholar misinforeview.hks.harvard.edu/article/gpt-fabricated-scientific-papers-on-google-scholar-key-features-spread-and-implications-for-preempting-evidence-manipulation/

261 sats \ 0 comments \ @hn 8 Sep 2024 tech

Google and OpenAI Release New Fast Models blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-flash-lite/

340 sats \ 0 comments \ @lunin 4 Mar AI

ChatGPT Data Leakage via a Hidden Outbound Channel in the Code Execution Runtime research.checkpoint.com/2026/chatgpt-data-leakage-via-a-hidden-outbound-channel-in-the-code-execution-runtime/

427 sats \ 0 comments \ @0xbitcoiner 31 Mar AI

Linexjlin/GPTs: leaked prompts of GPTs github.com/linexjlin/GPTs

297 sats \ 1 comment \ @hn 28 Nov 2023 tech

Introducing the Prompt Enhancer and Optimizer Plugin for OpenAgents!

1646 sats \ 3 comments \ @BrianisNice 20 May 2024 openagents freebie

OpenAI is rumored to be dropping GPT-5 soon: here's what we know about the next-gen model www.tomsguide.com/ai/chatgpt/openai-is-rumored-to-be-dropping-gpt-5-soon-heres-what-we-know-about-the-next-gen-model

481 sats \ 0 comments \ @ch0k1 22 Apr 2024 tech

OpenAI o1 vs GPT 4o – Is it worth paying 6x more? - Bind AI blog.getbind.co/2024/09/13/openai-o1-vs-gpt-4o-is-it-worth-paying-6x-more/

210 sats \ 0 comments \ @ch0k1 15 Sep 2024 tech

Large Language Models Pass the Turing Test arxiv.org/pdf/2503.23674

374 sats \ 11 comments \ @south_korea_ln 15 Apr 2025 AI

Where the goblins came from - OpenAI openai.com/index/where-the-goblins-came-from/

641 sats \ 3 comments \ @Scoresby 30 Apr AI

Meituan's LongCat-Flash reasoning model has been released longcat.chat/

258 sats \ 5 comments \ @lunin 22 Sep 2025 AI

GDPval: Measuring the performance of our models on real-world tasks - OpenAI openai.com/index/gdpval/

388 sats \ 8 comments \ @Scoresby 2 Oct 2025 AI

3 Erdos problems solved within a week by GPT 5.2 Pro www.erdosproblems.com/forum/thread/397

731 sats \ 9 comments \ @zuspotirko 11 Jan science tech AI