[How-To] Agent Factory
Vercel just open-sourced their reference background coding agent. Here is what to click if you are not an engineer, and what to copy if you are.
An agent factory is not an agent. It is the scaffolding around the agent. The places where one process ends and another begins.
Earlier today, Vercel open-sourced vercel-labs/open-agents. Guillermo Rauch announced it on LinkedIn as a reference app for background coding agents: a chat that takes a prompt, spins up a real machine in the cloud, edits a real Git repo, and opens a real pull request, with no laptop in the loop.
I had my agents pull it apart. It is also the cleanest open-source example I have seen of what I have started calling an agent factory. Not a clever prompt. Not a single agent. A system that reliably produces shippable agent runs without falling over.
This piece has two halves on purpose.
The first half is for the founder, PM, or team lead who wants the outcome without writing the plumbing. The second half is for the engineer who is actually going to build it. Read whichever half is closer to your problem. Or both, in order.
Part 1. The Easy Path
What you can switch on instead of build
You do not need a quarter of platform engineering to ship something that looks and behaves like an agent product. You need to recognise that most of the hard parts have already been turned into products you can click.
Most of what used to be infrastructure is now a setting. The hard part is recognising that, and acting on it.
Here is the short version of how to think about it.
A safe place to run the agent’s code
Letting an AI agent run shell commands on your own machines is how you end up writing a postmortem. The right answer is a fresh, isolated, throwaway computer for each session that disappears when the session ends. On Vercel that product is called Vercel Sandbox. It boots fast, hibernates when nobody is using it, and you stop paying for it the moment it goes idle. You do not have to think about cleanup, security boundaries, or noisy neighbours.
A way to run long jobs that survive the user closing the tab
Real agent tasks take minutes, not seconds. If your “job” is just a web request, the user closing their browser kills the work. The fix is a product called Vercel Workflow. You start a job and it keeps running on Vercel’s servers. The user can close their laptop and come back the next morning to a finished pull request waiting for them.
The first time a user closes their tab and comes back to a finished result, your product feels like magic. That is the entire point.
One door to every AI model
You will want to try Claude, then GPT, then Gemini, then whatever launches in six weeks. Wiring each one up separately is a tax on every future decision. Vercel AI Gateway sits in front of all the major model providers. You point your code at the gateway, you choose the model in a setting, and if a provider has an outage you can fall back to another one without shipping new code. You also see all the spending in one place instead of six dashboards.
A real database, with a private copy for every change
Through the Vercel Marketplace you can add a Postgres database in a few clicks (Neon is the default and it is good). The clever part is that every preview link gets its own branch of the database, forked from the live one. Your AI agent can write data in a preview without ever touching real customer data. This used to be a multi-week project. It is now a setting.
Deploys that do not scare anyone
Every change you push gets its own preview link automatically. Show it to a colleague. Test it. Promote it to production only when you are happy. For the launches that matter, Rolling Releases sends the change to a small slice of users first, lets you watch what happens, and rolls back instantly if something looks off.
Boring on paper. In practice, this is the single biggest reason small teams ship faster on Vercel than on raw cloud infrastructure.
Logged-in users without a login system
Sign in with Vercel lets your users log in the same way they log into Vercel itself. Pair it with a Marketplace auth provider like Clerk and you inherit a stack of password resets, session handling, and two-factor flows you do not have to design.
Fewer bots, fewer surprise bills
The endpoints you expose to AI models are exactly the kind of thing scrapers love to hit. Vercel BotID handles bot detection at the edge so you do not wake up to a forty-thousand-dollar model bill from a script that found your API last night.
An AI reviewer for every pull request
Vercel Agent reviews your pull requests automatically and helps investigate things that go wrong in production. If your team is small, it is the cheapest extra reviewer you will ever hire.
The honest trade-off
You are putting deploys, runtime, AI routing, database provisioning, auth, bot defence, and observability under one vendor. That is real concentration. The upside is also real: a two-person team can ship something that looks and behaves like an enterprise product, in days, without hiring a platform engineer. The quarters of work you skip are quarters you spend on the part of the product that is actually yours. The behaviour. The prompts. The user experience.
The plumbing has been commoditised. Use the commodity.
A sensible order to switch things on
Start with the highest-leverage change and work down.
First, deploy the repo to Vercel. The preview links alone are worth the move.
Then add a Marketplace database. Per-preview branching changes how the team ships.
Route AI calls through AI Gateway. Provider swaps stop being code changes.
Move long-running agent work into Vercel Workflow. User disconnects stop being incidents.
Move untrusted code execution into Vercel Sandbox. You stop hand-rolling machine hygiene.
Add Sign in with Vercel and a Marketplace auth provider when you are ready for real users.
Turn on Rolling Releases the day before the first launch you actually care about.
Part 2. The Engineering
Patterns I would copy straight out of the Open Agents repo
Below are the design choices in vercel-labs/open-agents that would actually make an engineer’s life easier if they copied them. I am not listing everything the repo does. I am listing the choices that, having read the source and the lessons-learned file, I would want to know before starting my own build.
1. Keep the agent outside the machine it is operating
The repo’s architecture is three boxes: web, agent, sandbox. The interesting choice is that the agent does not run inside the sandbox. The agent is its own process. It interacts with the sandbox through a small set of tools (read, write, search, shell) and nothing else.
That separation is the load-bearing decision in the project. It means the agent can keep thinking while the sandbox hibernates. It means the sandbox can be replaced or rebuilt without touching the agent. It means the model provider can change without touching the sandbox at all. The README calls this out as “the main point of the project,” and after a day in the source I agree.
If you take only one architectural idea from this piece, take this one. Every other pattern below assumes it.
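To make the boundary concrete, here is a minimal sketch of an agent that only ever sees a tool surface. The names SandboxTools, FakeSandbox, and Agent are illustrative assumptions, not types from the repo, and the in-memory sandbox stands in for a real cloud machine.

```typescript
// The only surface the agent is allowed to see. Hypothetical names,
// mirroring the four tools named above: read, write, search, shell.
interface ShellResult {
  exitCode: number;
  stdout: string;
}

interface SandboxTools {
  read(path: string): string;
  write(path: string, content: string): void;
  search(pattern: string): string[];
  shell(command: string): ShellResult;
}

// In-memory stand-in for a real sandbox, useful for tests.
class FakeSandbox implements SandboxTools {
  private files: Record<string, string> = {};

  read(path: string): string {
    const content = this.files[path];
    if (content === undefined) throw new Error("no such file: " + path);
    return content;
  }

  write(path: string, content: string): void {
    this.files[path] = content;
  }

  // Returns the sorted paths of files whose content contains the pattern.
  search(pattern: string): string[] {
    const hits: string[] = [];
    for (const path of Object.keys(this.files)) {
      if (this.read(path).indexOf(pattern) !== -1) hits.push(path);
    }
    return hits.sort();
  }

  // Only understands "ls"; a real sandbox would execute the command.
  shell(command: string): ShellResult {
    if (command === "ls") {
      return { exitCode: 0, stdout: Object.keys(this.files).sort().join("\n") };
    }
    return { exitCode: 127, stdout: "" };
  }
}

// The agent holds a reference to the tool surface, never to the machine.
class Agent {
  readonly history: string[] = [];
  private tools: SandboxTools;

  constructor(tools: SandboxTools) {
    this.tools = tools;
  }

  // Swap the sandbox without losing anything the agent remembers.
  attach(tools: SandboxTools): void {
    this.tools = tools;
  }

  // Replaces every occurrence of a symbol and returns the number of files touched.
  renameSymbol(from: string, to: string): number {
    const paths = this.tools.search(from);
    for (const path of paths) {
      this.tools.write(path, this.tools.read(path).split(from).join(to));
    }
    this.history.push("renamed " + from + " -> " + to + " in " + paths.length + " file(s)");
    return paths.length;
  }
}
```

Because the agent depends only on the interface, replacing the sandbox is a call to attach, and the agent's history survives the swap.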
2. Model each chat turn as a durable workflow, not a request
In the repo, a chat message does not execute the agent inline. It starts a Workflow SDK run. The HTTP handler returns immediately. The agent loop continues server-side, persists its state between steps, and the client reconnects to the run by id when the user opens the tab again.
The non-obvious follow-on, which the lessons-learned file calls out explicitly: anything that has to happen after the agent finishes (auto-commit, auto-pull-request, triggering sandbox hibernation) must be scheduled from the server side too. A client-side useEffect that listens for “ready” misses every turn that finishes while the page is closed.
Once you commit to the durable model, every post-turn automation has to live on the server. There is no half-version of this rule that works.
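The shape of that rule can be sketched without any workflow engine at all. The code below is a toy model, not the Workflow SDK: the in-memory runs table stands in for durable storage, and startTurn, advance, and reconnect are hypothetical names for what the HTTP handler, the background worker, and a returning client each do.

```typescript
type RunStatus = "running" | "finished";

interface Run {
  id: string;
  prompt: string;
  status: RunStatus;
  completedSteps: string[]; // persisted after every step
  postTurnLog: string[];    // what the server-side hooks did
}

// Stand-in for durable storage. A real system would use a database
// or the workflow engine's own persisted state.
const runs: Record<string, Run> = {};
let nextId = 1;

// Hooks that must fire when a turn ends, whether or not a client is connected.
const postTurnHooks: Array<(run: Run) => string> = [
  (run) => "auto-commit for " + run.id,
  (run) => "open pull request for " + run.id,
];

// What the HTTP handler does: record the run and return its id immediately.
function startTurn(prompt: string): string {
  const id = "run_" + nextId;
  nextId += 1;
  runs[id] = { id, prompt, status: "running", completedSteps: [], postTurnLog: [] };
  return id;
}

// What the background worker does: record one completed step, and on the
// last step run the post-turn hooks on the server.
function advance(id: string, step: string, isLast: boolean): void {
  const run = runs[id];
  if (!run || run.status === "finished") return;
  run.completedSteps.push(step);
  if (isLast) {
    run.status = "finished";
    for (const hook of postTurnHooks) run.postTurnLog.push(hook(run));
  }
}

// What a returning client does: look the run up by id.
function reconnect(id: string): Run | undefined {
  return runs[id];
}
```

Nothing in advance checks whether a browser is attached. The hooks fire because the turn ended, which is the property a client-side listener cannot give you.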
3. Treat snapshots as a stop, not a backup
The sandbox lifecycle in the repo is snapshot-driven. Idle sessions hibernate. Resume re-attaches to the snapshot rather than reinstalling everything from scratch. That part is expected.
What is not expected, and is worth knowing before you write a single line of lifecycle code: creating a snapshot of a Vercel sandbox automatically shuts down the source sandbox. It is a stop transition, not a non-disruptive backup. If your code assumes you can keep working in the sandbox while a snapshot is being taken, you will spend an afternoon chasing race conditions that are not race conditions, just unfortunate semantics.
The repo also runs the lifecycle decisions themselves inside a durable workflow, with retry on “skipped” evaluations. Without that retry an idle session never hibernates unless a fresh event happens to nudge it.
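A small model of both behaviours, under stated assumptions: the Sandbox class, the evaluate function, and the simulated clock are all hypothetical, and "skipped" is taken here to mean "not idle long enough yet". The real Vercel Sandbox API and the repo's lifecycle workflow will look different.

```typescript
type SandboxState = "running" | "stopped";

class Sandbox {
  state: SandboxState = "running";
  lastActivityMs: number;
  snapshotId: string | null = null;

  constructor(nowMs: number) {
    this.lastActivityMs = nowMs;
  }

  // Taking a snapshot ends the sandbox. There is no "snapshot and keep going".
  snapshot(): string {
    if (this.state !== "running") throw new Error("sandbox is not running");
    const id = "snap_" + this.lastActivityMs;
    this.state = "stopped";
    this.snapshotId = id;
    return id;
  }

  // Any work in the sandbox counts as activity and requires a running machine.
  exec(nowMs: number): void {
    if (this.state !== "running") {
      throw new Error("sandbox is stopped; resume from the snapshot first");
    }
    this.lastActivityMs = nowMs;
  }
}

type Evaluation = "hibernated" | "skipped";

// One lifecycle check against the current time.
function evaluate(sandbox: Sandbox, nowMs: number, idleLimitMs: number): Evaluation {
  if (sandbox.state !== "running") return "hibernated";
  if (nowMs - sandbox.lastActivityMs < idleLimitMs) return "skipped";
  sandbox.snapshot();
  return "hibernated";
}

// Re-checks on a timer instead of waiting for a fresh event to arrive.
// Returns the attempt number that hibernated the sandbox, or -1 if none did.
function hibernateWhenIdle(
  sandbox: Sandbox,
  startMs: number,
  idleLimitMs: number,
  retryEveryMs: number,
  maxAttempts: number,
): number {
  let now = startMs; // simulated clock; a durable workflow would sleep instead
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (evaluate(sandbox, now, idleLimitMs) === "hibernated") return attempt;
    now += retryEveryMs;
  }
  return -1;
}
```

Without the loop in hibernateWhenIdle, a single "skipped" result would be the last word until something else happened to call evaluate again.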
4. Split research from execution with subagents
The agent in the repo exposes a task tool that delegates work to one of two specialised subagents.
The explorer is read-only. It has grep, glob, read, and a restricted shell. Its job is to answer questions about the codebase and return a summary.
The executor has the full toolset. Its job is to actually change things.
The parent agent decides which one to call. The result is parallelism without giving every parallel branch write access. It also gives you a clean place to put summarisation, because the explorer hands back a digest instead of a transcript. The parent context stays lean across long sessions.
Anything that can be done read-only should be. The shape is small. The discipline is what matters.
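One way to enforce that discipline is an allowlist per role, checked before any tool runs. This is a sketch under assumptions: the tool names for the write side ("write", "shell") and the functions chooseRole and runTask are made up for illustration, and the planned tool calls are scripted rather than chosen by a model.

```typescript
type ToolName = "grep" | "glob" | "read" | "shell_readonly" | "write" | "shell";
type Role = "explorer" | "executor";

// The explorer gets read-only tools; the executor gets everything.
const TOOLSETS: Record<Role, ToolName[]> = {
  explorer: ["grep", "glob", "read", "shell_readonly"],
  executor: ["grep", "glob", "read", "shell_readonly", "write", "shell"],
};

interface TaskResult {
  role: Role;
  summary: string; // the digest handed back to the parent, not a transcript
}

// The parent picks the least-privileged role that can do the job.
function chooseRole(needsWrite: boolean): Role {
  return needsWrite ? "executor" : "explorer";
}

// Validates a scripted list of tool calls against the role's allowlist,
// then returns only a short summary to the caller.
function runTask(role: Role, plannedCalls: ToolName[]): TaskResult {
  const allowed = TOOLSETS[role];
  for (const call of plannedCalls) {
    if (allowed.indexOf(call) === -1) {
      throw new Error(role + " may not call " + call);
    }
  }
  return {
    role,
    summary: role + " finished after " + plannedCalls.length + " tool call(s)",
  };
}
```

The parent only ever receives the summary string, which is what keeps its context small however many tool calls the subagent made.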
5. Treat AGENTS.md and lessons-learned.md as the most important files in the repo
The repo has an AGENTS.md at the root with a single rule at the top: when you make a mistake or learn something new, add it to lessons-learned.md. That second file is now over a hundred entries deep and reads like an honest debrief of every weird interaction the team has hit.
A few examples worth pointing at, because they are the kind of thing you only learn once:
Drizzle migrations run automatically during build, so every preview deploy applies pending migrations to its own database branch. That is excellent. It also means a generated migration file with unrelated drift in it (a default on a column you did not touch) will quietly land in production. Review the generated SQL before committing.
In zsh, file paths that include brackets (Next.js dynamic routes like [id]) are interpreted as glob patterns. Every git command that touches one of those paths needs the path quoted, or you get “no matches found.”
Workflow SDK observability is a separate permission from regular Vercel project access. A successful login and project link does not imply you can run workflow inspect. Expect a 401 until the right product permission is granted.
None of those are guessable. All of them cost a real engineer real hours. Writing them down once is cheaper than discovering them five times.
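The zsh lesson is the easiest of the three to encode once and forget. A minimal sketch, assuming your agent builds git commands as strings: shellQuote and gitAddCommand are hypothetical helpers, and the quoting is the standard POSIX single-quote form, which zsh also honours.

```typescript
// Wraps a value in single quotes so the shell treats it literally.
// An embedded single quote is written as '\'' (close, escaped quote, reopen).
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Builds a git add command whose paths cannot be read as glob patterns.
// The "--" tells git that everything after it is a path, not an option.
function gitAddCommand(paths: string[]): string {
  return "git add -- " + paths.map((p) => shellQuote(p)).join(" ");
}
```

With this in place, a Next.js route such as app/[id]/page.tsx reaches git as a literal path instead of triggering “no matches found.”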
Start the lessons file on day one. It compounds faster than any prompt-engineering trick you will try this year.
6. Run every user-triggered Git action as the user, not as the integration
The default for a GitHub-integrated agent is to authenticate once as the GitHub App installation and use that token for everything. Open Agents is in the middle of moving away from exactly that pattern. The in-flight PLAN.md in the repo is a checklist of every callsite that still uses an installation token (clone, push, pull-request, merge, close, checks, repo listing, branch listing, create-repo) and rewires each of them to use the linked OAuth user token instead.
The motivation is twofold. First, attribution: commits and pull requests show up as the user, not as a bot. Second, permissions: installation tokens and user OAuth tokens have different permission surfaces, even when they look interchangeable, and you will hit 403 Resource not accessible by integration in surprising places (fork creation, certain pull-request creations, Actions log reads) if you stay on the installation token for user actions.
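One way to keep that rule from eroding is to route every callsite through a single token chooser. The sketch below is an assumption about how you might structure it, not the repo's code: Trigger, LinkedTokens, and tokenFor are hypothetical, and the choice to reserve the installation token for background work is mine.

```typescript
type Trigger = "user" | "background";

interface LinkedTokens {
  userOAuthToken: string | null; // null until the user links their GitHub account
  installationToken: string;     // the GitHub App installation token
}

interface TokenChoice {
  kind: "user" | "installation" | "missing";
  token: string | null;
}

// User-triggered actions use the user's OAuth token or nothing at all.
// Falling back to the installation token would silently change attribution.
function tokenFor(trigger: Trigger, tokens: LinkedTokens): TokenChoice {
  if (trigger === "background") {
    return { kind: "installation", token: tokens.installationToken };
  }
  if (tokens.userOAuthToken === null) {
    return { kind: "missing", token: null };
  }
  return { kind: "user", token: tokens.userOAuthToken };
}
```

A "missing" result is a prompt to link the account, which is a better failure than a commit that unexpectedly shows up under a bot.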
A few specifics from the same lessons file that are worth copying directly:
The GitHub App needs to be made public for the org picker to appear during install. While the app is private, the install page only shows the user’s personal account, and people cannot install on their organisations.
OAuth callbacks that process a code or installation_id must validate a server-stored state value before linking accounts. Do not trust callback query parameters.
Even when a branch push succeeds, pull-request creation can still return 403 Resource not accessible by integration. The repo’s fallback is to redirect the user to a prefilled GitHub compare URL with the title and body as query parameters, so they can finish the pull request manually in the browser. Graceful degradation is only an option once you have committed to user-attributed actions in the first place.
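The last two items translate into very little code. This is a sketch, not the repo's implementation: isValidState and compareUrl are hypothetical helpers, the example repository is made up, and quick_pull=1 is the GitHub query parameter that opens the pull-request form directly, which the source does not mention.

```typescript
// Checks the state from the callback against the value the server stored
// for this session. Rejects missing values and compares without exiting early.
function isValidState(stored: string | undefined, received: string | undefined): boolean {
  if (!stored || !received || stored.length !== received.length) return false;
  let diff = 0;
  for (let i = 0; i < stored.length; i++) {
    diff |= stored.charCodeAt(i) ^ received.charCodeAt(i);
  }
  return diff === 0;
}

interface PullRequestDraft {
  owner: string;
  repo: string;
  base: string;
  head: string;
  title: string;
  body: string;
}

// Encodes a branch name segment by segment so slashes stay readable.
function encodeRef(ref: string): string {
  return ref.split("/").map((part) => encodeURIComponent(part)).join("/");
}

// Builds a prefilled compare URL the user can finish in the browser
// when pull-request creation through the API is refused.
function compareUrl(draft: PullRequestDraft): string {
  return (
    "https://github.com/" +
    encodeURIComponent(draft.owner) + "/" + encodeURIComponent(draft.repo) +
    "/compare/" + encodeRef(draft.base) + "..." + encodeRef(draft.head) +
    "?quick_pull=1" +
    "&title=" + encodeURIComponent(draft.title) +
    "&body=" + encodeURIComponent(draft.body)
  );
}
```

The fallback keeps the title and body the agent wrote, so the user's only manual step is pressing the create button.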
Closing
The interesting thing about Open Agents is not that it lets a chat box write code. Every demo does that now. The interesting thing is the seams. Where the agent ends and the sandbox begins. Where the request ends and the workflow begins. Where the bot identity ends and the user identity begins. Each of those seams is the result of a real production scar.
If you are a founder, copy the platform that gives you the seams for free. If you are an engineer, copy the seams.
Either way, the era where every team built this from scratch is, mercifully, ending.


