There is a post that keeps getting written, by a different maker, every few weeks. The latest version put it best: "The demo was clean. The UI was sharp. The deploy worked. I genuinely thought I was done. Then real users showed up. And everything I never built started breaking." The author then listed exactly what broke: auth that leaked data, an API hammered with zero rate limiting, errors piling up silently with nothing tracking them, and a database that slowed to a crawl on its first real query.
That is not a one-off. It is the standard failure curve for an app shipped fast with an AI builder. The tool's job is to make the app work, and it does that brilliantly. It does not build the layers that only matter once strangers, scripts, and scale show up. This guide is about those layers, in the order they break, with the order to fix them. If you have not launched yet, run the pre-launch security checklist first; this article is the sequel, for the day the traffic actually arrives.
Break #1: the database runs out of connections (not the server)
Most makers brace for "the server can't handle the load." That is rarely what falls first. What falls first is the database connection limit, and it falls quietly. Many AI-built stacks point straight at a hosted Postgres like Supabase and open a fresh connection per request. That works for one tester. It does not work for a crowd.
A Supabase project on a small or free compute allows roughly 60 direct connections, and that number is tied to the compute size. Connect directly and you can exhaust it with about 60 concurrent users, after which every new request throws a "too many clients" error and the app looks down even though the server is fine. The fix is not a bigger plan; it is connecting correctly. Route traffic through the connection pooler in transaction mode (the port-6543 connection string) instead of the direct one. That alone takes most apps from falling over at dozens of users to comfortably serving thousands.
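One cheap guard is a startup check that the app is actually pointed at the pooler port. The sketch below only inspects the port in the connection string; the hosts, user, and password are placeholders rather than real Supabase values, and the function name is made up for illustration.

```python
from urllib.parse import urlparse

# Transaction-mode pooler port named in the text above.
POOLER_TRANSACTION_PORT = 6543


def uses_transaction_pooler(database_url: str) -> bool:
    """Return True if the connection string targets the pooler's transaction-mode port."""
    return urlparse(database_url).port == POOLER_TRANSACTION_PORT


# Hypothetical connection strings; hosts and credentials are placeholders.
direct_url = "postgresql://app_user:secret@db.example.com:5432/postgres"
pooled_url = "postgresql://app_user:secret@pooler.example.com:6543/postgres"

for name, url in (("direct", direct_url), ("pooled", pooled_url)):
    status = "ok" if uses_transaction_pooler(url) else "WARNING: not using the pooler port"
    print(f"{name}: {status}")
```

Copy the real pooled connection string from your provider's dashboard rather than editing the port by hand, since the pooler may use a different host and username format than the direct connection.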
The other half of break #1 is the query itself. As one team that reviewed a dozen vibe-coded apps with users noted, "day 1 DB looks fine. day 15 you've got duplicated fields, nullable everywhere, no indexes." A query with no index is instant on an empty table and a crawl on a real one. Find your slowest endpoint and add an index on the column you filter by. It is the single cheapest performance win you will ever ship.
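To see what an index changes, here is a small sketch that uses Python's built-in SQLite as a stand-in for Postgres. The `orders` table, its columns, and the index name are invented for the demo. The same idea applies in Postgres with `CREATE INDEX` and `EXPLAIN`, though the plan output looks different there.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
    [(i % 500, i * 1.5) for i in range(10_000)],
)

# Hypothetical hot query: filter by the column the endpoint uses.
hot_query = "SELECT * FROM orders WHERE user_id = ?"

plan_before = query_plan(conn, hot_query, (42,))  # full table scan
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
plan_after = query_plan(conn, hot_query, (42,))   # lookup through the index

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index the plan reports a scan of the whole table; afterwards it reports a search using `idx_orders_user_id`. That difference is what separates "instant on an empty table" from "a crawl on a real one."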
Break #2: a paid API call per action blows up your bill
If your app calls an LLM, a maps service, or any metered API on every user action, traffic multiplies that cost directly and immediately. This is the break that turns a success story into a panicked tweet about a five-figure bill.
The clearest real example: a solo dev's public-toilet map, LooCation, took over 100,000 requests in 48 hours and "absolutely melted my servers." The pressure was not compute; it was the map API bill, and the top suggestion from commenters was to swap the default to OpenStreetMap and reserve the paid maps tier for premium use. Same shape on the AI side: at OpenAI's list pricing at the time of writing, GPT-4o runs about $2.50 per million input tokens and $10 per million output tokens, while GPT-4o mini is roughly $0.15 and $0.60. If a feature works on the mini model, you have just cut that line item by more than 90% with one config change.
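The arithmetic behind that claim is worth running against your own numbers. The sketch below uses the per-million-token prices quoted above; the token counts per request and the request volume are assumptions chosen for illustration, not measurements.

```python
# Prices in USD per million tokens (input, output), as quoted in the text.
PRICES_PER_MILLION = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request at the quoted prices."""
    price_in, price_out = PRICES_PER_MILLION[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Assumed workload: adjust these to match your own feature.
INPUT_TOKENS = 1_000
OUTPUT_TOKENS = 500
REQUESTS = 100_000

full_cost = request_cost("gpt-4o", INPUT_TOKENS, OUTPUT_TOKENS) * REQUESTS
mini_cost = request_cost("gpt-4o-mini", INPUT_TOKENS, OUTPUT_TOKENS) * REQUESTS
savings = 1 - mini_cost / full_cost

print(f"gpt-4o:      ${full_cost:,.2f}")
print(f"gpt-4o-mini: ${mini_cost:,.2f}")
print(f"savings:     {savings:.0%}")
```

Under these assumptions the larger model costs about $750 for 100,000 requests and the mini model about $45. Because both the input and output prices are about 6% of the larger model's, the saving stays near 94% whatever the token mix.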
Three moves, in order: set a hard spend cap in the provider dashboard today so a runaway loop cannot bankrupt you; move calls to the cheapest model or free tier that still does the job; and structure prompts so the static part is cached (cached input is billed at half price). For the full version of the cost-control playbook, see the soft-cap pattern for AI prompt costs.
Break #3: no rate limiting, so one actor drains the budget
Breaks #2 and #3 are linked, and #3 is what makes #2 dangerous. With no rate limiting, nothing stops a single user, a curious dev with a for loop, or a bot from sending thousands of requests in seconds. Every one of those hits your paid API and grabs a database connection. As the most-upvoted comment on that app-review thread put it: "Cost per active user is essential... put rate limiters in place to prevent malicious users draining your wallet."
You do not need anything fancy. A per-IP or per-user limit on your write endpoints and any AI or paid-API endpoint is enough to stop the wallet attack and protect the connection pool. Put the cheap limit in before the elegant solution: even "max 30 requests per minute per user" closes the worst hole. This is the highest leverage-to-effort fix on the entire list.
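A minimal sketch of the "max 30 requests per minute per user" rule, written as an in-memory sliding-window limiter. The class name and the injectable `clock` parameter are choices made for this example. Because it keeps state in process memory, it only protects a single long-running server; on serverless or multi-instance deployments the counters would need to live in a shared store.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each key."""

    def __init__(self, max_requests=30, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # key -> timestamps of recent allowed requests

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject before touching the paid API or the DB
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=30, window_seconds=60.0)
# In a request handler, key by user ID (or IP for anonymous traffic):
# if not limiter.allow(user_id): respond with HTTP 429
```

Run the check before any paid API call or database write so a rejected request costs almost nothing.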
Break #4: nothing is cached, so every view does full work
By now the app stays up and the bill is contained, but it is doing far more work than it needs to. The same expensive query or API call fires on every page load, even when the answer has not changed. Caching is how you cut load and cost at the same time.
Cache the things that are read often and change rarely: a leaderboard, a public listing, a generated summary, a config blob. A short time-to-live (even 60 seconds) on a hot read can remove most of the traffic from your database and your paid APIs without anyone noticing the data is a minute stale. Two cautions for serverless stacks. First, a Vercel Hobby function caps out around 60 seconds (longer with Fluid Compute), so do not lean on long-running functions to paper over a slow query; fix the query and cache the result. Second, a free Supabase project pauses after 7 days of inactivity and takes ~30 seconds to wake, which is its own kind of first-request stall worth knowing about.
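The short-TTL idea fits in a few lines. This is a sketch of an in-process cache, with a made-up `TTLCache` class and a stand-in `load_leaderboard` function in place of a real query. On serverless platforms, memory is not shared between instances, so a shared cache or HTTP-level caching would play the same role there.

```python
import time


class TTLCache:
    """Cache computed values for a fixed number of seconds."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: skip the expensive work
        value = compute()
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


def load_leaderboard():
    # Stand-in for an expensive query or paid API call.
    return ["alice", "bob", "carol"]


cache = TTLCache(ttl_seconds=60.0)
leaderboard = cache.get_or_compute("leaderboard", load_leaderboard)
```

With a 60-second TTL, a page that is viewed hundreds of times a minute triggers the expensive call roughly once a minute per process instead of on every view.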
Break #5: errors pile up silently with nothing watching
This is the one makers discover last and regret most. The launch-day version of the story always includes "errors piled up silently with nothing tracking them." Without monitoring, your first signal that something is broken is an angry user, or worse, silence as people quietly leave.
Wire up a free-tier error tracker (Sentry and friends have generous free plans) so unhandled exceptions land in your inbox or a channel, not a void. Add an uptime ping on your main URL. That is the whole task for a small app. You are not building an observability platform; you are buying yourself the right to find out about a fire while it is still small. Pair it with the spend cap from break #2 so a cost spike alerts you too.
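In practice a tracker's SDK does this wiring for you. The sketch below only shows the underlying idea in plain Python: catch the exception at the handler boundary, hand it to a reporter, and re-raise. The `report_errors` decorator and the `reporter` callback are hypothetical names for this example, not any vendor's API.

```python
import functools
import traceback


def report_errors(reporter):
    """Wrap a handler so unhandled exceptions are reported, then re-raised."""

    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            try:
                return handler(*args, **kwargs)
            except Exception as exc:
                # Send enough context to debug later; a real tracker captures far more.
                reporter({
                    "handler": handler.__name__,
                    "error": repr(exc),
                    "traceback": traceback.format_exc(),
                })
                raise  # do not swallow the error; the caller still sees the failure

        return wrapper

    return decorator


# Stand-in reporter: a real one would send the report to your tracker or chat channel.
collected = []


@report_errors(collected.append)
def create_order(quantity: int) -> int:
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return quantity * 2
```

The re-raise matters: reporting should add visibility without changing how the request fails.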
Break #6: bots and abuse find your open endpoints
Once an app is public and ranking, automated traffic finds it. Scrapers hit your listings, bots probe your forms, and the same "Security in vibe-coded apps is a disaster" energy that goes looking for open databases also goes looking for free compute. The rate limiting from break #3 absorbs most of this. Add a captcha or a honeypot on public submission forms, require auth on anything that writes meaningful data, and make sure your data access rules are actually enforced. Database security is its own deep topic, and if you are on Supabase the table-by-table walkthrough is in Supabase Row Level Security for vibe coders, which covers the exposure test and the policies that close it. Lock that down before anything else, because a data breach is the one break on this list you cannot quietly fix after the fact.
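A honeypot is the cheapest of these to add. The form includes a field that is hidden from people with CSS, and the server drops any submission where that field has been filled in. The field name `website` and the function name below are assumptions for this sketch. The check only catches naive bots, so it complements rate limiting and auth rather than replacing them.

```python
# Hypothetical name of a form field hidden from human visitors with CSS.
HONEYPOT_FIELD = "website"


def looks_like_bot(form: dict) -> bool:
    """Return True if the hidden honeypot field was filled in."""
    value = form.get(HONEYPOT_FIELD, "")
    return bool(str(value).strip())


human_submission = {"email": "person@example.com", "message": "Hello", "website": ""}
bot_submission = {"email": "spam@example.com", "message": "Buy now", "website": "http://spam.example"}

for submission in (human_submission, bot_submission):
    verdict = "reject" if looks_like_bot(submission) else "accept"
    print(submission["email"], "->", verdict)
```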
The fix order, on one screen
Do these in blast-radius order, not in the order that is most fun:
- Lock down data access. A breach is unrecoverable. Enforce row-level rules first.
- Fix the connection model. Use the pooler in transaction mode and index your hot queries so the app stays up.
- Cap and rate-limit spend. Provider spend cap, cheaper model or free tier, per-user limits on paid and write endpoints.
- Cache hot reads. Short TTLs on the expensive things that rarely change.
- Add error monitoring. Free-tier tracker plus an uptime ping. Find out before your users do.
- Block obvious abuse. Captcha or honeypot on public forms, auth on writes.
None of this is a rewrite. Every item is roughly an afternoon, and the first three buy you the runway to do the rest calmly. The makers who survive their own launch are not the ones who built everything up front; they are the ones who knew the failure order and fixed in the right sequence when the traffic hit.
And if you find yourself rebuilding the same missing layers by hand on every project, that is itself a signal worth reading. It might be time to graduate part of the stack to a real backend, or to move the fast-moving part onto a database built for operational write volume rather than one you are fighting to keep online.