What are the three usage-based meter models?

Hard cap (plan includes N, blocked beyond), soft cap with overage (plan includes N, billed per unit beyond), and credit packs (prepaid pool that draws down). Each fits a specific cost structure.

When should I use a hard cap?

When per-unit cost is small but the customer would be surprised by overage (SMS, calls), when the product is a free tier with the cap as the upgrade trigger, or when you have a hard per-customer cost budget you cannot exceed.

When does soft cap with overage win?

When per-unit cost is predictable and small (AI prompts, API calls), the customer controls consumption, and the unit is explicable in one sentence. The required pairing is a customer-facing dashboard so the meter is visible.

When are credit packs the right answer?

When per-unit cost is expensive in absolute terms (AI agent runs at $0.50+), when you want zero credit risk (prepayment only), or when AI vendor costs can spike and you want customers locked in at the pre-purchase rate.

Three meter models compared: hard cap, soft overage, credit packs

Q: Can I mix meter models on one feature?

No. Customers cannot keep multiple models for one feature straight; support tickets land within the first billing cycle. Different features can have different models; one feature gets one model.

The Flexprice AMA on r/StartUpIndia (26 points, 38 comments) captured the question makers ask once: "What pricing model should I use for usage-based?" The answer is rarely "the one that works in all cases" — three meter models cover almost every shape, and each one fits a specific cost structure. This article walks through them with real numbers.

Model 1: Hard cap

How it works. The plan includes N units per period. Beyond N, the customer is blocked from consuming more until they upgrade or the period resets.

When it fits:

Per-unit cost is small but the unit is something the customer would be surprised to overrun (SMS messages, outbound phone calls).
The product is a free tier with the cap as the upgrade trigger by design.
You as the operator have hard cost limits — e.g., a per-customer Twilio budget — and cannot tolerate unbounded usage.

Failure mode. The mid-task block. The customer is in the middle of doing the thing the product is for, hits the cap, gets stopped. Most quit before upgrading. Hard caps lose customers when applied to features the customer actively needs.

Concrete numbers. An SMS-sending product on a Twilio-pass-through cost of $0.007 per US message. Plan includes 500 messages at $9.99/mo. Hard cap at 500. The customer at message 501 sees "upgrade or wait until next month." Most upgrade. The product avoids a customer who would have sent 50,000 messages and bankrupted the plan.

Model 2: Soft cap with overage

How it works. The plan includes N units. Beyond N, every unit is billed at a published per-unit rate. Customer is never blocked.

When it fits:

Per-unit cost is small and predictable (AI prompts at $0.02 each, API calls at $0.001).
The customer is in control of consumption (they fire the action; it does not run in the background).
The unit is visible and explicable in one sentence.

Failure mode. The invisible meter. The customer cannot see what they are using; the first overage charge is a surprise; churn happens. The dashboard is mandatory; the meter alone is a trap. (See the dashboard article.)

Concrete numbers. An AI assistant product on $0.02-per-prompt actual cost. Plan: $19.99/mo with 200 prompts included, overage at $0.05/prompt. 90th-percentile customer uses 180 prompts, never overruns. Top 10% averages 400 prompts: pays $19.99 base + $10 overage = $29.99 effective. The product's gross margin per heavy customer holds; the product's experience for normal customers is unchanged.

Model 3: Credit packs (prepaid)

How it works. The customer buys a pack of N units upfront. Consumption draws down the pack. When the pack is empty, the customer must buy another or stop using the feature.

When it fits:

Per-unit cost is expensive in absolute terms (AI agent runs, transcription minutes, image generation).
You want zero credit risk — never charge after the fact, only on prepayment.
The unit is large enough that the customer thinks about each one (a 10-credit run is a decision).
Your AI vendor cost can spike (e.g., new model release) and you want the customer pre-purchasing rather than being billed at the new rate.

Failure mode. Friction at every reload. The customer is running their workflow, the pack empties, the workflow stops, the customer has to pull out a card. Auto-reload at threshold (when below 20% of pack, auto-buy more) is the standard fix.

Concrete numbers. An AI agent product on $0.50 actual cost per run. Sells 100-credit packs at $99 ($0.99 per credit, ~2x margin). 1-credit = 1 run. Customer buys a pack monthly on auto-reload. Heavy customer triggers an auto-reload at credit 80; pack refills to 100. The product has zero unpaid usage and predictable revenue.

The Credyt and Flexprice angle

Both Credyt and Flexprice position themselves around AI-specific meter primitives. Their pitch is roughly: Stripe Billing was designed for SaaS subscriptions, not AI per-run pricing, and the gap is structural. They are right enough that the category has space. For makers shipping AI features today, either of them (plus UsageBox, Metronome, and a few others) handles the meter primitive in a way that Stripe Billing alone does not.

The pattern is the same across all three vendors: model definition (pick hard cap / soft / credit pack), event ingestion (HTTP POST per metered action), aggregation (per customer per period), and customer-facing dashboard. The differentiation is in pricing, AI-specific primitives, and self-hosted-vs-managed.

Don't mix models on one feature

The most common failure: trying to be clever and shipping all three models for one feature. "You get 100 prompts hard-capped, plus 50 overage prompts, plus credit packs if you want more." Customers cannot keep this straight; support tickets land within the first billing cycle.

Pick one model per feature. If you have multiple features (AI prompts, file processing, exports), each one can have its own model — that is fine because the customer encounters them in separate parts of the product. Mixing models on one feature is where customers lose the thread.

The decision in three questions

Is per-unit cost small or large? Small: soft cap. Large: credit packs.
Can the customer tolerate being blocked mid-task? Yes (free tier, hard budgets): hard cap. No: soft cap or credit packs.
Is the unit visible and explicable? If no, none of the models will work; redesign the unit first.

What I would actually do

Start with soft cap with overage. The default that works for most features.
Add credit packs only for expensive AI features. Where each unit costs you $0.10+ and the customer should be deciding per-unit.
Use hard cap for the free tier. The cap is the upgrade trigger; that is fine.
Never mix models on one feature. Different features can have different models; one feature has one model.

The honest framing: three meter models cover almost every shape of usage-based product. Pick the one that fits the feature's cost structure and the customer's expectation. Picking the right model is more important than the choice of metering service.