The story of building a commercial Squares app with AI guidance, hard cloud lessons, and a lot of architecture cleanup after the first working version.

The Squares Game is a 100% vibe-coded app that makes Squares for football and basketball, pro and college, easier to run, join, and follow, with a passwordless experience for players and managers.
I started because I had been hearing how AI coding could expand software creation to people who have ideas but are not traditional app developers. My test case came from something I had personally experienced: running a Squares game for the 2024 Super Bowl and realizing that everyone else was having fun while I was tracking picks, payments, payment methods, winners, and follow-up.
Why
The app was not born from a generic sports idea. It came from the manual work of managing a Squares game: who picked which squares, who paid, how they paid, who won, what each winner was owed, and whether payouts had been made.
The goal became a manager-first product: game selection, clean player access, payment instructions, optional manager-connected Stripe card payments, automated score tracking, winner calculation, payout amount calculation, payment tracking, and payout tracking.
I was a novice with modern enterprise and cloud software. My previous coding experience was mostly embedded software before moving into sales in 1996, plus a few small Python scripts since then. I needed guidance on the stack before I could even start building.
The AI-guided path pointed me toward a web app built with Next.js, React, Vercel, Firebase/Firestore, Redis, Stripe, and supporting services. That guidance made the first working app possible. The harder work came later: making the architecture disciplined enough for real users, real traffic, and real cloud bills.
Early building blocks
Timeline
The interesting part was not only that the app got built. It was how each test, failure, and cost surprise forced a different architecture decision.
Waiting on AI changed too
Early on, Codex often took long enough that I set up a practice net in the backyard and worked on golf shots while waiting for coding runs to finish. My golf game improved by about four strokes during the project. By the later phases, that rhythm had changed: Codex was completing coding tasks much faster than it had when I started in August 2025, and waiting became less of the workflow.
August 2025
I began with the question that made vibe coding interesting to me: could someone with product judgment, but without modern cloud app experience, turn an idea into a real commercial product?
October 2025
The app already had the core look and feel, but the test exposed the first hard cloud lesson: working code can still be expensive code. A rogue browser session drove Google costs to roughly $50 an hour.
Late 2025
That cost shock forced a deeper architecture pass. The app moved away from one Firestore document per square toward round summaries and compact grid documents, reducing Firestore access and simplifying the model.
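As a rough illustration of that change, the sketch below folds per-square records into one compact grid document. The type names, field names, and the 10x10 grid size are assumptions for illustration; the article does not show the app's real schema. Since Firestore bills per document read, rendering a grid from a single document instead of 100 is the kind of saving this shape is aiming at.

```typescript
// Hypothetical shapes; the real app's field names are not shown in this article.
type SquareDoc = { row: number; col: number; owner: string | null };

// Compact form: one document holds every cell, indexed as row * GRID_SIZE + col.
type GridDoc = { gameId: string; owners: (string | null)[] };

const GRID_SIZE = 10; // assumes the usual 10x10 Squares grid

// Fold many per-square records into a single grid document.
function toGridDoc(gameId: string, squares: SquareDoc[]): GridDoc {
  const owners: (string | null)[] = new Array(GRID_SIZE * GRID_SIZE).fill(null);
  for (const s of squares) {
    owners[s.row * GRID_SIZE + s.col] = s.owner;
  }
  return { gameId, owners };
}

// Look up one cell without touching any other document.
function ownerAt(grid: GridDoc, row: number, col: number): string | null {
  return grid.owners[row * GRID_SIZE + col] ?? null;
}
```

With this shape, a screen that draws the whole grid needs one document, and a round summary can live in a second small document beside it.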
Super Bowl 2026
A last-minute score-tracking change broke automation and exposed duplicated game data. The fix was not just a bug patch; it pushed the app toward clearer canonical records, rules-engine discipline, and versioned game state.
After the test
At one point, the right move was to reset the Git HEAD to a commit from several weeks earlier and rebuild from that cleaner point. That became a major milestone: it was better to preserve the lesson than preserve every generated line.
Release path
The work shifted from getting features to exist toward making the system faster, cheaper, more observable, easier to maintain, and guarded by contracts that prevent AI-driven drift.
Refactors
The first AI-built versions could make features appear. The better versions came after I pushed the architecture toward clearer sources of truth, lower access cost, stronger cache rules, version discipline, and fewer duplicated paths.
A major lesson was that shared helpers should be evaluated before coding, not only after drift appears. Much of this project learned the pattern retroactively: AI would solve the local prompt, similar logic would appear in more than one place, and then we would extract the durable behavior into a helper and protect it with a contract. A better AI workflow asks up front, before any code is generated, whether the behavior already exists in a shared helper or should become one.
The earliest model was easy to reason about but noisy at scale. Moving to round summaries and color-grid summary documents made the app simpler, reduced Firestore reads and writes, and aligned the data model with what screens actually needed.
Shared helpers became one of the most important ways to maintain source-of-truth discipline. Instead of letting each screen, API route, or agent-generated change recalculate scoring, payments, cache keys, grid state, or Firestore access in its own local way, the app moved important behavior into shared helpers that every caller had to use.
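A minimal sketch of what such a helper might look like for winner calculation. Every name here is hypothetical, and the rule it encodes (match the last digit of each team's score against the digits assigned to the rows and columns) is the common Squares convention, assumed rather than taken from the app's code.

```typescript
// Hypothetical names throughout; the article does not show the app's real helpers.
type Axes = { rowDigits: number[]; colDigits: number[] };
type Cell = { row: number; col: number };

// The single place that turns a score into a winning cell.
// Assumes rows track the home team's last digit and columns the away team's.
function winningCell(axes: Axes, homeScore: number, awayScore: number): Cell {
  const valid = (n: number) => Number.isInteger(n) && n >= 0;
  if (!valid(homeScore) || !valid(awayScore)) {
    throw new Error("scores must be non-negative integers");
  }
  const row = axes.rowDigits.indexOf(homeScore % 10);
  const col = axes.colDigits.indexOf(awayScore % 10);
  if (row < 0 || col < 0) throw new Error("each axis must contain every digit 0-9");
  return { row, col };
}

// Two callers reuse the helper instead of re-deriving the rule locally.
function scoreboardLabel(axes: Axes, home: number, away: number): string {
  const { row, col } = winningCell(axes, home, away);
  return `Winning square: row ${row}, column ${col}`;
}

function payoutOwner(
  axes: Axes,
  owners: (string | null)[],
  home: number,
  away: number,
): string | null {
  const { row, col } = winningCell(axes, home, away);
  return owners[row * 10 + col] ?? null;
}
```

If the rule ever changes, only `winningCell` changes, and the scoreboard and payout paths cannot drift apart.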
One of the biggest wins was changing the cache mindset. Instead of treating cache as a fragile fallback layer, the app moved toward cache-first reads with intentional rebuild paths from canonical data when cache entries were missing or stale.
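The pattern can be sketched roughly as follows. This is an in-memory, synchronous stand-in, not the app's Redis code: `createCacheFirstReader`, `loadCanonical`, and the TTL-based staleness rule are assumptions made to keep the example self-contained.

```typescript
// In-memory stand-in for a real cache; names and the TTL staleness rule are assumptions.
type Entry<T> = { value: T; expiresAt: number };

function createCacheFirstReader<T>(
  loadCanonical: (key: string) => T, // rebuild path from the source of truth
  ttlMs: number,
  now: () => number = Date.now, // injectable clock, handy for tests
) {
  const cache = new Map<string, Entry<T>>();
  let rebuilds = 0;
  return {
    // Serve from cache when the entry is fresh; otherwise rebuild on purpose.
    read(key: string): T {
      const hit = cache.get(key);
      if (hit !== undefined && hit.expiresAt > now()) return hit.value;
      const value = loadCanonical(key);
      cache.set(key, { value, expiresAt: now() + ttlMs });
      rebuilds += 1;
      return value;
    },
    // Exposed so rebuild frequency can be observed.
    rebuildCount: () => rebuilds,
  };
}
```

The point of the design is that a cache miss is a normal, planned event with one rebuild path, not an error case each caller handles in its own way.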
The Redis design evolved from multiple overlapping layers into a clearer split: stable data that changes rarely and dynamic data that reflects live game activity. That reduced source-of-truth confusion and made cache invalidation easier to reason about.
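One way to make such a split explicit is to encode the class in the cache key itself, so every key is either stable or dynamic and nothing else. The key layout, the part names, and the TTL values below are placeholders, not the app's real configuration.

```typescript
// Hypothetical key scheme; TTL values are placeholders, not the app's settings.
type CacheClass = "stable" | "dynamic";

const TTL_SECONDS: Record<CacheClass, number> = {
  stable: 86400, // rarely changing data, such as game setup
  dynamic: 15, // live game activity, such as scores
};

// Every key states its class up front, so ownership is never ambiguous.
function cacheKey(cls: CacheClass, gameId: string, part: string): string {
  return `${cls}:game:${gameId}:${part}`;
}

// Derive the TTL from the key's class; reject keys outside the two classes.
function ttlFor(key: string): number {
  const cls = key.split(":")[0];
  if (cls === "stable" || cls === "dynamic") return TTL_SECONDS[cls];
  throw new Error(`unclassified cache key: ${key}`);
}

// A score change only ever touches dynamic keys.
function keysToInvalidateOnScoreChange(gameId: string): string[] {
  return [cacheKey("dynamic", gameId, "scores"), cacheKey("dynamic", gameId, "winners")];
}
```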
Versioning started as a way to protect live grids, scoring, and migration behavior. The system eventually became simpler when the app converged on one active version instead of carrying multiple behavior paths forward.
Live score syncing needed discipline. The app experimented with threading and chunked sync work to reduce fan-out, avoid unnecessary updates, and keep scoreboard refreshes from turning into a cloud-cost multiplier.
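A small sketch of two of those ideas, skipping unchanged scores and batching the rest into chunks. The `Score` shape and function names are illustrative assumptions; the threading experiments mentioned above are not modeled here.

```typescript
// Illustrative shapes and names; not the app's real sync code.
type Score = { gameId: string; home: number; away: number };

// Split a list into fixed-size batches.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) throw new Error("size must be a positive integer");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Keep only scores that are new or differ from what was last written.
function changedScores(previous: Map<string, Score>, latest: Score[]): Score[] {
  return latest.filter((s) => {
    const old = previous.get(s.gameId);
    return old === undefined || old.home !== s.home || old.away !== s.away;
  });
}

// Plan the sync: no writes for unchanged games, bounded batches for the rest.
function planSync(previous: Map<string, Score>, latest: Score[], chunkSize: number): Score[][] {
  return chunk(changedScores(previous, latest), chunkSize);
}
```

Filtering before chunking means a quiet scoreboard refresh produces an empty plan and no downstream writes at all.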
The first cost spike made it clear that public routes need guardrails. Rate limits, bot-aware thinking, CDN behavior, and browser caching became part of the product architecture instead of an afterthought.
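For a sense of what a basic guardrail looks like, here is a minimal fixed-window rate limiter. It is an in-memory sketch with hypothetical names, not the app's implementation; on a serverless platform, where instances do not share memory, the counters would likely need to live in a shared store such as Redis.

```typescript
// In-memory fixed-window limiter; names are hypothetical, not the app's code.
function createRateLimiter(limit: number, windowMs: number, now: () => number = Date.now) {
  if (!Number.isInteger(limit) || limit < 1) throw new Error("limit must be a positive integer");
  const windows = new Map<string, { start: number; count: number }>();

  // Returns true when the request is allowed, false when the client is over the limit.
  return function allow(clientId: string): boolean {
    const t = now();
    const w = windows.get(clientId);
    if (w === undefined || t - w.start >= windowMs) {
      // First request, or the previous window has ended: start a new window.
      windows.set(clientId, { start: t, count: 1 });
      return true;
    }
    if (w.count >= limit) return false;
    w.count += 1;
    return true;
  };
}
```

A single runaway browser session would hit `false` after `limit` requests per window instead of generating unbounded backend reads.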
Some ideas, including contact-management paths, were useful learning steps but not central enough to keep expanding. Trimming or parking features became part of making the product easier to operate.
Early debugging often meant copying Vercel logs, Firestore logs, browser output, and screenshots into files or chats. Over time, Codex and the local workflow improved enough to inspect Vercel, Firestore, Redis, and runtime traces more directly.
Help
The first help approach was screenshot annotation: mark up screens, ask AI to write matching help text, then keep the help content aligned as the UI changed. For a UI-intensive app, that became too much manual maintenance.
The better idea was help mode. Instead of maintaining a separate help manual, the UI components themselves became help-capable. Controls, dialogs, dashboards, and major pages can expose contextual explanations where the user is already working.
Contracts now help enforce that new interactive UI does not quietly skip help-mode coverage. That was important because help is not a bolt-on for this app. It is part of making a complex manager workflow understandable without a separate training document.
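A coverage check of that kind could be as small as the sketch below. The `UiControl` shape and the `helpText` field are invented for illustration; the article does not describe how the app actually registers help content.

```typescript
// Hypothetical registry entry; the app's real help-mode metadata is not shown.
type UiControl = { id: string; interactive: boolean; helpText?: string };

// Return the ids of interactive controls that have no usable help text.
function missingHelpCoverage(controls: UiControl[]): string[] {
  return controls
    .filter((c) => c.interactive && (c.helpText === undefined || c.helpText.trim() === ""))
    .map((c) => c.id);
}
```

Run in a test suite, a non-empty result fails the build, so a new dialog or button cannot ship without contextual help.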
Guardrails
AI agents are strongest when the work is specific. They are less reliable when the prompt leaves architecture open. Contracts helped turn product and architecture decisions into executable rules, and AGENTS.md told the agent how to behave on every task.
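To make "executable rules" concrete, here is a sketch of a small JSON contract and a checker for it. The contract fields, the rule name, and the paths are all hypothetical, and the import string is only a placeholder for whatever database module a project uses; the point is that an architecture decision (database access goes through shared helpers) becomes something a test can fail on.

```typescript
// Hypothetical contract format; not the format shipped in the guardrails pack.
type ImportContract = {
  name: string;
  forbiddenImport: string;
  allowedPathPrefixes: string[];
};

type SourceFile = { path: string; imports: string[] };

// The contract lives as JSON so agents and humans read the same rule.
const firestoreContract: ImportContract = JSON.parse(`{
  "name": "firestore-access-only-via-helpers",
  "forbiddenImport": "firebase-admin/firestore",
  "allowedPathPrefixes": ["lib/data/"]
}`);

// Return the paths of files that import the module outside the allowed folders.
function contractViolations(contract: ImportContract, files: SourceFile[]): string[] {
  return files
    .filter((f) => f.imports.includes(contract.forbiddenImport))
    .filter((f) => !contract.allowedPathPrefixes.some((p) => f.path.startsWith(p)))
    .map((f) => f.path);
}
```

An agent that solves a local prompt by reaching for the database directly from a route would trip this check, which is exactly the kind of drift the contracts are meant to catch.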
I packaged a sanitized example of the kind of AI project guidance that helped this app: an AGENTS.md example, a prompt starter, architecture checklist, contracts notes, and small JSON contract examples.
Download the guardrails pack
Builder guide
AI can help you get started before you know the whole stack. But if your goal is a real product, you eventually have to become the architecture owner.
If you run Squares games, or know someone who does, try the freemium version and see how much of the manager work it can take off your plate.
Stripe availability and account requirements depend on Stripe-supported countries and account rules. Peer-to-peer payment instructions are also supported for managers who do not use Stripe.