Most AI APIs still have the same two failure modes:
- You gate everything behind auth/paywall, so nobody tries it.
- You give a big free tier, so you get usage but no revenue, and you're afraid to raise prices.
As of 2026-02-10, I still think Lightning is the cleanest micropayments rail for AI features.
L402 is the first HTTP-native pattern I've used that makes small payments feel ergonomic. The server just replies with HTTP 402 Payment Required, includes a BOLT11 invoice, and tells the client exactly how to retry.
The flow (concrete)
- Step 1: allow 1 free call per IP per 24h (so the happy path is dead simple)
- Step 2: after that, return `402` plus a Lightning invoice and a `payment_hash` (or retry header)
- Step 3: client pays the invoice, then retries the exact same request including the `payment_hash`
This is normal control flow, like handling 429 with Retry-After. You don't need accounts. You don't need web checkout.
The key mental model: payment is just another retryable state transition.
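Step 1's free tier doesn't need accounts either. A minimal sketch of the per-IP allowance (in-memory and per-process for illustration; `FREE_WINDOW_MS`, `allowFreeCall`, and the Map-based store are my own names — a real deployment would back this with Redis or similar):

```javascript
const FREE_WINDOW_MS = 24 * 60 * 60 * 1000; // 24h

// ip -> timestamp of that IP's last free call
const lastFreeCall = new Map();

// Returns true if this IP still has its one free call available.
function allowFreeCall(ip, now = Date.now()) {
  const last = lastFreeCall.get(ip);
  if (last !== undefined && now - last < FREE_WINDOW_MS) return false;
  lastFreeCall.set(ip, now);
  return true;
}
```

If `allowFreeCall` returns false, you fall through to the 402 branch instead of rejecting the request outright.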
What the server should return
You can do this entirely with HTTP semantics:
- Status: `402`
- A machine-readable invoice somewhere (header and/or JSON body)
- A single field that makes the retry unambiguous (`payment_hash`, or an opaque token)
- A stable rule for where the client should put that field on retry (header vs JSON)
If your L402 implementation ever returns "payment required" but doesn't provide a usable invoice, clients can't recover gracefully.
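Put together, the 402 payload can be a small pure function. A sketch — the `x-lightning-invoice` header name and the `retry` hint object are my own conventions, not part of any spec:

```javascript
// Build a 402 response: invoice in both a header and the JSON body,
// plus an explicit rule for where the retry should carry the proof.
function paymentRequired(invoice, paymentHash) {
  return {
    status: 402,
    headers: {
      "content-type": "application/json",
      // Machine-readable copy for clients that never parse the body.
      "x-lightning-invoice": invoice,
    },
    body: {
      invoice,
      payment_hash: paymentHash,
      // Tell the client exactly where the proof goes on retry.
      retry: { field: "payment_hash", location: "json_body" },
    },
  };
}
```

Everything a client needs to recover is in one place, so there's no guessing on retry.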
Client-side retry (pseudo-code)
This is the part people overcomplicate. Treat it like a 429:
```javascript
async function callWithL402(url, payload) {
  let res = await fetch(url, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (res.status !== 402) return res;

  const pr = await res.json(); // { invoice, payment_hash, ... }
  await payInvoice(pr.invoice); // your wallet integration

  // Retry the exact same request + payment proof.
  return fetch(url, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ ...payload, payment_hash: pr.payment_hash }),
  });
}
```

The server gets to be simple too: "same request, but now with a valid payment."
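The server side really is one extra branch. A sketch, assuming a `paidHashes` set kept current by your wallet's settlement webhook (that set, `handleRequest`, and the injected `createInvoice`/`processCall` helpers are all my own illustration):

```javascript
// Filled in by your wallet's settlement webhook when an invoice is paid.
const paidHashes = new Set();

// One branch: either the request carries a valid payment, or it gets a 402.
function handleRequest(body, createInvoice, processCall) {
  const hash = body.payment_hash;
  if (hash && paidHashes.has(hash)) {
    paidHashes.delete(hash); // one payment, one call — no replay
    return { status: 200, body: processCall(body) };
  }
  const { invoice, payment_hash } = createInvoice();
  return { status: 402, body: { invoice, payment_hash } };
}
```

Consuming the hash on use is what makes the retry idempotent-ish: a second request with the same proof just gets a fresh invoice.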
Pricing that actually matches AI
Per-call pricing is a better fit for a lot of AI product shapes:
- "Summarize this page" is a tiny marginal cost. Pricing it like Netflix feels wrong.
- Many users want sporadic utility, not an ongoing subscription.
- A small free tier can filter out bots without killing try-before-you-buy.
Rough ranges that feel psychologically sane:
- 1-10 sats: cheap, high-frequency calls
- 20-100 sats: "worth it" calls (rank, score, generate, transform)
- 100-500 sats: heavier calls (images, longer context)
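To sanity-check those ranges against fiat intuition, the sats-to-USD conversion is one line. The $100,000/BTC rate below is purely illustrative, not a quote:

```javascript
const SATS_PER_BTC = 100_000_000;

// Convert a sat price to USD at a given (assumed) BTC/USD rate.
function satsToUsd(sats, btcUsd = 100_000) {
  return (sats / SATS_PER_BTC) * btcUsd;
}
// At an assumed $100k/BTC: 21 sats is about two cents, 500 sats about fifty.
```

Even the "heavy" end of the range stays under a dollar, which is the whole point: prices a user can ignore.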
The real footgun: invoice creation is on the hot path
One operational note that matters more than the protocol:
If your wallet provider is unreachable (NAT, DNS, routing, rate limits), your API will start 500'ing right at the moment a user is trying to pay you.
If you're going to ship L402 on an AI API, treat invoice generation like critical infrastructure:
- multiple wallet backends (or a fallback chain)
- aggressive timeouts
- return `503` + `Retry-After` when invoicing is down, not a generic 500
- log correlation IDs so you can debug "user couldn't pay" failures fast
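The fallback chain and aggressive timeouts compose naturally from `Promise.race`. A sketch — the backend functions and the 1500ms budget are assumptions, not a recommendation:

```javascript
// Race a promise against a deadline.
function withTimeout(promise, ms) {
  return Promise.race([
    promise,
    new Promise((_, reject) =>
      setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms)
    ),
  ]);
}

// Try each wallet backend in order; first success wins.
async function createInvoiceWithFallback(backends, amountSats, timeoutMs = 1500) {
  let lastError;
  for (const backend of backends) {
    try {
      return await withTimeout(backend(amountSats), timeoutMs);
    } catch (err) {
      lastError = err; // in real code: log with a correlation ID
    }
  }
  // All backends failed: the caller should answer 503 + Retry-After, not 500.
  throw new Error(`invoicing unavailable: ${lastError}`);
}
```

Bounding each backend's attempt (rather than the whole chain) keeps one slow provider from eating the entire budget.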
What I'd like to see next
- SDKs that treat `402` + invoice as a first-class pattern.
- More APIs publishing machine-readable pricing (free tier + costs), not just prose.
If you shipped an AI feature where you'd rather charge 21 sats per call than $20/month, what is it?