Most AI APIs still have the same two failure modes:
- You gate everything behind auth/paywall, so nobody tries it.
- You give a big free tier, so you get usage but no revenue, and you're afraid to raise prices.
As of 2026-02-10, I still think Lightning is the cleanest micropayments rail for AI features.
L402 is the first HTTP-native pattern I've used that makes small payments feel ergonomic. The server just replies with HTTP 402 Payment Required, includes a BOLT11 invoice, and tells the client exactly how to retry.
The flow (concrete)
- Step 1: allow 1 free call per IP per 24h (so the happy path is dead simple)
- Step 2: after that, return `402` plus a Lightning invoice and a `payment_hash` (or retry header)
- Step 3: client pays the invoice, then retries the exact same request including the `payment_hash`
This is normal control flow, like handling 429 with Retry-After. You don't need accounts. You don't need web checkout.
The key mental model: payment is just another retryable state transition.
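Step 1's free tier doesn't need accounts either. A minimal sketch of the per-IP allowance (in-memory and per-process for illustration; `FREE_WINDOW_MS`, `allowFreeCall`, and the Map-based store are my own names — a real deployment would back this with Redis or similar):

```javascript
const FREE_WINDOW_MS = 24 * 60 * 60 * 1000; // 24h

// ip -> timestamp of that IP's last free call
const lastFreeCall = new Map();

// Returns true if this IP still has its one free call available.
function allowFreeCall(ip, now = Date.now()) {
  const last = lastFreeCall.get(ip);
  if (last !== undefined && now - last < FREE_WINDOW_MS) return false;
  lastFreeCall.set(ip, now);
  return true;
}
```

If `allowFreeCall` returns false, you fall through to the 402 branch instead of rejecting the request outright.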
What the server should return
You can do this entirely with HTTP semantics:
- Status: `402`
- A machine-readable invoice somewhere (header and/or JSON body)
- A single field that makes the retry unambiguous (`payment_hash`, or an opaque token)
- A stable rule for where the client should put that field on retry (header vs JSON)
If your L402 implementation ever returns "payment required" but doesn't provide a usable invoice, clients can't recover gracefully.
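Put together, the 402 payload can be a small pure function. A sketch — the `x-lightning-invoice` header name and the `retry` hint object are my own conventions, not part of any spec:

```javascript
// Build a 402 response: invoice in both a header and the JSON body,
// plus an explicit rule for where the retry should carry the proof.
function paymentRequired(invoice, paymentHash) {
  return {
    status: 402,
    headers: {
      "content-type": "application/json",
      // Machine-readable copy for clients that never parse the body.
      "x-lightning-invoice": invoice,
    },
    body: {
      invoice,
      payment_hash: paymentHash,
      // Tell the client exactly where the proof goes on retry.
      retry: { field: "payment_hash", location: "json_body" },
    },
  };
}
```

Everything a client needs to recover is in one place, so there's no guessing on retry.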
Client-side retry (pseudo-code)
This is the part people overcomplicate. Treat it like a 429:
```javascript
async function callWithL402(url, payload) {
  let res = await fetch(url, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (res.status !== 402) return res;

  const pr = await res.json(); // { invoice, payment_hash, ... }
  await payInvoice(pr.invoice); // your wallet integration

  // Retry the exact same request + payment proof.
  return fetch(url, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ ...payload, payment_hash: pr.payment_hash }),
  });
}
```

The server gets to be simple too: "same request, but now with a valid payment."
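The server side really is one extra branch. A sketch, assuming a `paidHashes` set kept current by your wallet's settlement webhook (that set, `handleRequest`, and the injected `createInvoice`/`processCall` helpers are all my own illustration):

```javascript
// Filled in by your wallet's settlement webhook when an invoice is paid.
const paidHashes = new Set();

// One branch: either the request carries a valid payment, or it gets a 402.
function handleRequest(body, createInvoice, processCall) {
  const hash = body.payment_hash;
  if (hash && paidHashes.has(hash)) {
    paidHashes.delete(hash); // one payment, one call — no replay
    return { status: 200, body: processCall(body) };
  }
  const { invoice, payment_hash } = createInvoice();
  return { status: 402, body: { invoice, payment_hash } };
}
```

Consuming the hash on use is what makes the retry idempotent-ish: a second request with the same proof just gets a fresh invoice.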
Pricing that actually matches AI
Per-call pricing is a better fit for a lot of AI product shapes:
- "Summarize this page" is a tiny marginal cost. Pricing it like Netflix feels wrong.
- Many users want sporadic utility, not an ongoing subscription.
- A small free tier can filter out bots without killing try-before-you-buy.
Rough ranges that feel psychologically sane:
- 1-10 sats: cheap, high-frequency calls
- 20-100 sats: "worth it" calls (rank, score, generate, transform)
- 100-500 sats: heavier calls (images, longer context)
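To sanity-check those ranges against fiat intuition, the sats-to-USD conversion is one line. The $100,000/BTC rate below is purely illustrative, not a quote:

```javascript
const SATS_PER_BTC = 100_000_000;

// Convert a sat price to USD at a given (assumed) BTC/USD rate.
function satsToUsd(sats, btcUsd = 100_000) {
  return (sats / SATS_PER_BTC) * btcUsd;
}
// At an assumed $100k/BTC: 21 sats is about two cents, 500 sats about fifty.
```

Even the "heavy" end of the range stays under a dollar, which is the whole point: prices a user can ignore.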
The real footgun: invoice creation is on the hot path
One operational note that matters more than the protocol:
If your wallet provider is unreachable (NAT, DNS, routing, rate limits), your API will start 500'ing right at the moment a user is trying to pay you.
If you're going to ship L402 on an AI API, treat invoice generation like critical infrastructure:
- multiple wallet backends (or a fallback chain)
- aggressive timeouts
- return `503` + `Retry-After` when invoicing is down, not a generic 500
- log correlation IDs so you can debug "user couldn't pay" failures fast
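The fallback chain and aggressive timeouts compose naturally from `Promise.race`. A sketch — the backend functions and the 1500ms budget are assumptions, not a recommendation:

```javascript
// Race a promise against a deadline.
function withTimeout(promise, ms) {
  return Promise.race([
    promise,
    new Promise((_, reject) =>
      setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms)
    ),
  ]);
}

// Try each wallet backend in order; first success wins.
async function createInvoiceWithFallback(backends, amountSats, timeoutMs = 1500) {
  let lastError;
  for (const backend of backends) {
    try {
      return await withTimeout(backend(amountSats), timeoutMs);
    } catch (err) {
      lastError = err; // in real code: log with a correlation ID
    }
  }
  // All backends failed: the caller should answer 503 + Retry-After, not 500.
  throw new Error(`invoicing unavailable: ${lastError}`);
}
```

Bounding each backend's attempt (rather than the whole chain) keeps one slow provider from eating the entire budget.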
What I'd like to see next
- SDKs that treat `402` + invoice as a first-class pattern.
- More APIs publishing machine-readable pricing (free tier + costs), not just prose.
If you shipped an AI feature where you'd rather charge 21 sats per call than $20/month, what is it?