The conversation about usage-based billing usually starts with Stripe and ends with Stripe. That is fine. The Stripe Billing meter primitive is a real, working piece of infrastructure: you POST usage events, Stripe aggregates them into invoice line items at the end of the period, and the customer's card gets charged. As a piece of the pipeline, it is good.
The problem is that it is one piece of five. The other four are the work, and Stripe's docs are notably quiet about them. Makers find this out the hard way. A representative thread (r/SaaS 2026-04, "Stripe metered billing broke me", 34 comments) walked through the exact sequence: customer triggers an AI agent, the app fires reportUsage, the network blinks, the retry double-counts, the dashboard shows one number, the invoice shows another, the support ticket is opened.
This article is the wiring you need around Stripe Billing so the network blink does not become a customer dispute.
Piece 1: durable event capture with idempotency keys
Every metered event must be written to your own store before it is sent to Stripe. Each row carries an idempotency key the app generates, typically a UUID derived deterministically from the action that caused the event, so the same action always yields the same key. On retry, the key resolves to the existing row: if Stripe already acknowledged the event, the row is marked sent and the duplicate POST is dropped; if not, the send is retried under the same key.
If you only have one piece of the pipeline, this is the one. Without it, every retry is a potential double-charge and every customer dispute is unwinnable because you have no source of truth other than Stripe's view.
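A minimal sketch of that capture step, assuming a SQLite table as the durable store. The table name, the make_key and record_and_send helpers, and the send callable (a stand-in for whatever actually POSTs to Stripe) are all hypothetical names for illustration, not part of any Stripe SDK.

```python
import sqlite3
import uuid
from typing import Callable


def make_key(customer_id: str, action_id: str) -> str:
    """Derive a deterministic key: the same action always yields the same UUID."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"usage:{customer_id}:{action_id}"))


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the local event store (hypothetical schema, one row per metered event)."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS usage_events (
               idempotency_key TEXT PRIMARY KEY,
               customer_id     TEXT NOT NULL,
               quantity        INTEGER NOT NULL,
               sent            INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return db


def record_and_send(
    db: sqlite3.Connection,
    customer_id: str,
    action_id: str,
    quantity: int,
    send: Callable[[str, str, int], None],
) -> bool:
    """Write the event locally first, then send it. Returns False for a dropped duplicate."""
    key = make_key(customer_id, action_id)
    with db:  # commit the row before anything touches the network
        db.execute(
            "INSERT OR IGNORE INTO usage_events (idempotency_key, customer_id, quantity) "
            "VALUES (?, ?, ?)",
            (key, customer_id, quantity),
        )
    (sent,) = db.execute(
        "SELECT sent FROM usage_events WHERE idempotency_key = ?", (key,)
    ).fetchone()
    if sent:
        return False  # already acknowledged: drop the duplicate POST
    send(key, customer_id, quantity)  # may raise; the row stays unsent for the next retry
    with db:
        db.execute("UPDATE usage_events SET sent = 1 WHERE idempotency_key = ?", (key,))
    return True
```

One gap remains in this sketch: if the send succeeds but the acknowledgement is lost before the row is marked sent, the retry goes out again. That is why the same key should travel with the request (Stripe accepts an Idempotency-Key header on POST requests), so the duplicate is rejected on the far side as well.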
Piece 2: a late-arrival window
Events arrive late. Mobile clients reconnect after twenty minutes offline. Background jobs flush an hour after the action. Stripe's meter aggregation has a cutoff per period: once the invoice closes, late events for that period are silently dropped, which is great for finance and disastrous for the customer who paid for the action.
The working pattern: define a "billing window" that is shorter than Stripe's aggregation period. Events older than the window are routed to a manual review queue (or, more practically, a small one-off credit-or-charge flow). Events within the window are sent normally.
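As a rough illustration, the routing decision can be a single function. The 48-hour window, the route names, and the choice to send future-dated events to review are assumptions of this sketch; pick values that fit your own period length.

```python
from datetime import datetime, timedelta, timezone

# Assumed window for illustration; it should be shorter than the aggregation period.
BILLING_WINDOW = timedelta(hours=48)


def route_event(
    occurred_at: datetime, now: datetime, window: timedelta = BILLING_WINDOW
) -> str:
    """Return "send" for events inside the billing window, "review" otherwise."""
    age = now - occurred_at
    if age < timedelta(0):
        # A timestamp in the future suggests clock skew; this sketch routes it to review.
        return "review"
    return "send" if age <= window else "review"
```

A mobile client that reconnects after twenty minutes lands in "send"; a background job that flushes three days late lands in "review", where the one-off credit-or-charge flow picks it up.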
Piece 3: plan-change reconciliation
A customer upgrades mid-period. Stripe creates a proration. The meter primitive is plan-scoped, which means events you sent against the old plan ID need to be re-attributed (or zero-aggregated) against the new plan, depending on your model. This is where every roll-your-own pipeline drifts from invoice math.
The fix is to capture plan-change events explicitly (Stripe's customer.subscription.updated webhook) and emit reconciliation usage entries that zero out the old plan's tail-of-period usage and re-add it to the new plan. Most makers skip this and discover it in the first month a serious customer upgrades.
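Here is one way the reconciliation entries could be computed, assuming your own ledger stores a plan ID and timestamp per entry. UsageEntry, the reason codes, and reconcile_plan_change are hypothetical names; the function would be called from the handler for the subscription-updated webhook with the time the plan changed.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class UsageEntry:
    plan_id: str
    quantity: int
    occurred_at: datetime
    reason: str = "usage"


def reconcile_plan_change(
    entries: List[UsageEntry], old_plan: str, new_plan: str, changed_at: datetime
) -> List[UsageEntry]:
    """Build a reversal/re-attribution pair for usage logged on the old plan after the change."""
    tail = sum(
        e.quantity
        for e in entries
        if e.plan_id == old_plan and e.occurred_at >= changed_at
    )
    if tail == 0:
        return []  # nothing was attributed to the old plan after the change
    return [
        UsageEntry(old_plan, -tail, changed_at, "plan_change_reversal"),
        UsageEntry(new_plan, tail, changed_at, "plan_change_reattribution"),
    ]
```

The two entries always sum to zero, so the customer's total usage is unchanged and only the attribution moves. Webhooks can be delivered more than once, so in practice the adjustment pair needs its own idempotency key, or the handler will re-attribute the same tail twice.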
Piece 4: refunds and credits as first-class events
"The customer's AI run failed, refund this one." The naive pattern is to issue a Stripe refund or a manual credit note. The correct pattern is to emit a negative usage event with a reason code. The usage ledger then tells the truth: 14 successful runs minus 1 failed = 13 billable.
If you cannot emit negatives, you have built a write-only meter and your support team is reconciling spreadsheets to invoices forever.
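A sketch of what "first-class" means here, assuming an append-only ledger held in your own store. LedgerEntry, the reason codes, and the guard against reversing more than was billed are illustrative choices, not a prescribed schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LedgerEntry:
    customer_id: str
    quantity: int  # positive for usage, negative for a reversal
    reason: str    # e.g. "usage" or "run_failed" (hypothetical reason codes)


def billable_total(ledger: List[LedgerEntry], customer_id: str) -> int:
    """Sum every entry, including negatives, for one customer."""
    return sum(e.quantity for e in ledger if e.customer_id == customer_id)


def reverse(ledger: List[LedgerEntry], customer_id: str, quantity: int, reason: str) -> None:
    """Append a negative entry instead of editing or deleting the original usage."""
    if quantity <= 0:
        raise ValueError("reversal quantity must be positive")
    if quantity > billable_total(ledger, customer_id):
        raise ValueError("cannot reverse more usage than is currently billable")
    ledger.append(LedgerEntry(customer_id, -quantity, reason))
```

With 14 recorded runs and one call to reverse(ledger, customer_id, 1, "run_failed"), billable_total returns 13, and the ledger still shows all 14 runs plus the reason the one was taken back.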
Piece 5: the customer-facing usage view
This is the underrated half of the pipeline. The customer needs to see, in your app, the same numbers Stripe will charge for. Same period, same total, same line items. If those numbers do not match what arrives in their email invoice, they will not trust you again. Mismatch is invariably caused by skipping piece 1 (idempotency) or piece 3 (plan-change reconciliation).
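One way to keep the two views honest is to compute the in-app totals from the same ledger and diff them against the invoice before the customer sees either. In this sketch, entries are plain (meter, quantity, occurred_at) tuples and invoice_quantities is a plain dict of meter name to invoiced quantity; both shapes are assumptions for illustration, not Stripe API objects.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, Tuple

Entry = Tuple[str, int, datetime]  # (meter, quantity, occurred_at)


def usage_view(
    entries: Iterable[Entry], period_start: datetime, period_end: datetime
) -> Dict[str, int]:
    """Total usage per meter for the half-open period [period_start, period_end)."""
    totals: Dict[str, int] = defaultdict(int)
    for meter, quantity, occurred_at in entries:
        if period_start <= occurred_at < period_end:
            totals[meter] += quantity
    return dict(totals)


def find_mismatches(
    view: Dict[str, int], invoice_quantities: Dict[str, int]
) -> Dict[str, Tuple[int, int]]:
    """Return {meter: (in_app_total, invoiced_total)} for every meter that disagrees."""
    meters = sorted(set(view) | set(invoice_quantities))
    return {
        m: (view.get(m, 0), invoice_quantities.get(m, 0))
        for m in meters
        if view.get(m, 0) != invoice_quantities.get(m, 0)
    }
```

An empty result from find_mismatches is the condition to check before the invoice email goes out. Anything else is a support ticket you get to open yourself, before the customer does.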
So when does "roll your own" stop being the right answer?
For one product, one meter, low volume: roll your own pipeline. You will learn the failure modes by being burned by them, and the integration with Stripe is small enough to maintain.
The moment you have two products or two meters, the pipeline complexity goes from "one well-tested function" to "a shared library of meter ingestion, dedupe, reconciliation, and reporting." That is when a purpose-built metering service starts paying for itself.
The category is real. UsageBox owns the meter, ingests events from any source, aggregates them, and pushes invoice line items into your Stripe account (or an alternative). Open-source options include Flexprice (Show HN, 2025-06) and Credyt (Show HN, 2026-01), which positions itself as "real-time billing built for AI." Picking one depends on whether you want a managed service (UsageBox), a self-hosted toolkit (Flexprice), or AI-native primitives (Credyt). The point is that you do not have to build all five pieces of the pipeline yourself.
What I would actually do
- Start with Stripe Billing direct. Build pieces 1 and 5 (idempotent capture + customer-facing view) yourself for one meter, one product. Run it for two billing cycles.
- Add piece 3 (plan changes) the first time a customer upgrades. Until then, the code does not exist and that is fine.
- Move to a metering service the moment you add a second meter. That is the inflection point where roll-your-own becomes a tax.
The honest framing: Stripe Billing is a meter primitive, not a billing pipeline. The pipeline is yours to build (or buy) and is what determines whether your usage-based pricing actually works.