Why I’m Bullish on OpenAI
GPT-5.5, Codex, and the developer layer Anthropic keeps underestimating
OpenAI is turning frontier models into portable developer infrastructure. Anthropic has the strongest mindshare in coding agents, which makes its product choices matter even more.
The OpenAI versus Anthropic debate often gets flattened into a personality story. OpenAI becomes the messy incumbent: Sam Altman controversies, nonprofit governance fights, Microsoft entanglement, Elon Musk litigation, and years of criticism over the gap between its founding ideals and its commercial reality. Anthropic becomes the principled challenger: former OpenAI employees, a safety-first brand, Claude, and Claude Code.
That framing misses the developer-platform story.
Anthropic built the coding agent with the strongest mindshare. Claude Code made many developers feel, for the first time, that an AI coding tool could operate like a useful command-line collaborator. But the success of Claude Code also created a confusion that now works in Anthropic’s favor: people say “Claude” when they really mean a full product stack. They are reacting to the model, the CLI, the prompts, the tool loop, the permissions, the context strategy, the shell behavior, the product defaults, and the hidden harness engineering around all of it.
The model gets credit for the harness.
OpenAI appears to understand this layer more clearly than its critics give it credit for. Codex is now a product surface, an open-source CLI, an integration target, a cloud agent, a review system, and a harness architecture. The launch of GPT-5.5 matters because it lands inside that broader Codex system.
That is why I’m bullish on OpenAI.
Claude Code made the harness visible by hiding it
Claude Code changed how people talk about coding agents. Many developers who ignored earlier AI coding tools suddenly started using a terminal agent for real work. That was a product breakthrough.
Claude is excellent. Claude Code became one of the most important developer products in AI. Anthropic has gained serious enterprise traction, and surveys such as Menlo Ventures’ State of Generative AI in the Enterprise show Anthropic gaining share in enterprise LLM spend and code generation.
The brand worked. It also revealed a contradiction in Anthropic’s narrative. Anthropic has often suggested that Claude Code is relatively light and that most of the value comes from the model. That claim is hard to square with the way Anthropic has treated the Claude Code source.
In 2025, a developer de-obfuscated Claude Code and published the source. Anthropic responded with a DMCA takedown. In 2026, Claude Code source surfaced again through a packaging leak, and Anthropic again moved to contain the spread. The legal response is understandable from a proprietary software perspective. It also confirms that the harness is valuable.
The harness contains real product judgment: tool routing, memory compaction, shell behavior, approval policy, prompts, error handling, background execution, UI assumptions, and product-specific instructions. Those details shape the developer experience as much as the raw model does.
OpenAI took a very different posture with Codex CLI. The CLI is open-source. Developers can inspect it, fork it, extend it, embed it, and compare it against other harnesses. That choice changes the trust relationship. OpenAI is letting developers see part of the agent layer that Anthropic keeps proprietary.
OAuth blessing is the real interoperability differentiator
One meaningful difference is where subscription authentication is allowed to work.
Anthropic keeps Claude subscription access inside Claude Code. Developers who tried to use Claude subscription credentials through third-party harnesses started running into blocked requests, rejected credentials, and account-risk concerns.
The first public sign I saw was Peter Steinberger’s January 8, 2026 post: “This credential is only authorized for use with Claude Code and cannot be used for other API requests.” That was the start of a much larger developer-relations problem. Over the following days and weeks, Anthropic blocked third-party harness use cases, restricted workflows, and created fear among users that experimenting with external harnesses could put their accounts at risk.
OpenAI moved in the opposite direction.
OpenAI allowed and supported ChatGPT/Codex subscription OAuth for OpenClaw. The same posture extended to OpenCode. I also helped coordinate efforts to make similar subscription-based access work for OpenHands.
That matters more than a feature checklist. It tells developers that OpenAI is willing to let the user choose the harness, even while it doubles down on its own. The subscription relationship can become a portable access layer across third-party tools, rather than a mechanism for locking users inside one first-party client.
That posture creates better incentives. Harnesses can compete on product quality. Developers can choose the environment that fits their workflow. OpenAI still gets the model relationship, but the ecosystem gets room to experiment.
Anthropic’s stance pushes in the other direction. Claude Code is the blessed path inside its walled garden. External harnesses become policy problems.
Cross-harness standards reveal product philosophy
AGENTS.md is a small file with large implications.
The AGENTS.md convention gives coding agents repo-specific instructions in a portable format. Build steps, test commands, repo conventions, architecture notes, and agent guidance can live in the repo where every compatible harness can read them.
Claude Code has CLAUDE.md.
CLAUDE.md works well only inside Claude Code. AGENTS.md works across many other harnesses and tools. That difference becomes more important as teams adopt multiple agents and hand work off between people who use different tools. Repo instructions should belong to the repo, not the vendor.
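To make the portability point concrete, here is a minimal Python sketch of how a harness could discover repo-level instructions. The lookup order, the fallback to CLAUDE.md, and the name find_instruction_file are illustrative assumptions. This is not a description of how Codex, Claude Code, or any other harness actually resolves these files.

```python
from pathlib import Path
from typing import Optional, Sequence

# Hypothetical lookup order: vendor-neutral file first, vendor-specific second.
DEFAULT_CANDIDATES = ("AGENTS.md", "CLAUDE.md")


def find_instruction_file(
    start: Path, candidates: Sequence[str] = DEFAULT_CANDIDATES
) -> Optional[Path]:
    """Walk from the `start` directory up to the filesystem root and return
    the first instruction file found, trying candidates in order per level."""
    current = start.resolve()
    for directory in (current, *current.parents):
        for name in candidates:
            path = directory / name
            if path.is_file():
                return path
    return None
```

Calling find_instruction_file(Path.cwd()) from anywhere inside a repo would return the nearest matching file, or None if there is none. A harness that only knows a vendor-specific filename would need that name added to the candidate list, which is the extra work a shared convention avoids.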
The same pattern is emerging around .agents, an unofficial, still-emerging convention for vendor-neutral agent configuration. The goal is to put instructions, skills, tasks, memories, model preferences, and tool configuration into version-controllable files that can travel across harnesses. For Claude Code, only .claude works; it is one of the few harnesses that does not support .agents.
This is the difference between portability and lock-in.
OpenAI’s recent posture favors cross-harness standards. Anthropic’s Claude Code surface still centers on Claude-specific files and directories. Anthropic’s product can be excellent inside its own boundary while still making it harder for developers to carry their workflow elsewhere.
That is a real tradeoff. It will matter more as agent workflows become team infrastructure rather than individual experiments.
App Server turns Codex into an integration surface
The Codex App Server is one of the most important pieces of OpenAI’s current developer strategy.
OpenAI’s App Server exposes the Codex harness through a JSON-RPC interface. That gives other applications a way to integrate deeply with the harness layer: authentication, conversation state, streamed events, approval prompts, and agent execution. It is the kind of interface that lets a third-party product build around Codex without simply wrapping a command-line binary.
That matters for OpenClaw. It matters for the Codex desktop app. It matters for any future IDE, orchestration platform, team workspace, review tool, or internal developer platform that wants to use Codex as an agent substrate.
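For readers who have not worked with JSON-RPC, the sketch below shows the generic JSON-RPC 2.0 message shapes that an interface like this builds on. It is an illustration of the public JSON-RPC 2.0 specification, not of the App Server itself: the method names are hypothetical, and the real wire format and method set may differ. My assumption is that streamed events map naturally to notifications and approval prompts to server-initiated requests.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_ids = count(1)  # simple monotonically increasing request ids


def make_request(method: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize a JSON-RPC 2.0 request as one line of JSON."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def classify(line: str) -> str:
    """Classify an incoming JSON-RPC 2.0 message.

    A message with a method and an id is a request (the peer expects a reply),
    a method without an id is a notification, and a result or error is a response.
    """
    message = json.loads(line)
    if "method" in message:
        return "request" if "id" in message else "notification"
    if "result" in message or "error" in message:
        return "response"
    return "invalid"
```

A client built this way would send something like make_request("example/startTurn", {"prompt": "fix the failing test"}), where the method name is made up for illustration, then read lines from the server and dispatch on classify. The useful property is that the server can ask the client a question mid-task, which a one-shot CLI invocation cannot do cleanly.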
This is a major contrast with Anthropic.
Anthropic has Claude Code, a strong model family, and a capable product. But the surrounding integration story remains ambiguous. Developers have repeatedly asked what is allowed with claude -p, CI workflows, distributed sandboxes, open-source projects, commercial software, personal tools, third-party harnesses, and subscription credentials. Matt Pocock captured the confusion well in his post about Anthropic’s subscription rules.
OpenAI is moving toward an embeddable harness architecture. Anthropic is defending a first-party harness boundary.
That difference will shape which ecosystem developers build around.
Harness engineering is OpenAI’s internal advantage
OpenAI’s Harness Engineering article gave a useful look at how OpenAI thinks about agentic software development.
The key idea is that coding-agent performance depends on the environment around the model. Repository knowledge, documentation, lints, worktrees, tests, screenshots, logs, metrics, review loops, and automated feedback all become part of the agent’s operating context.
That is exactly the right framing.
The next phase of agentic software engineering will depend on how well teams design the environment in which agents operate. The model matters, but the agent also needs a legible repo, durable instructions, scoped tools, feedback loops, review affordances, sandboxing, and context that survives across tasks.
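One way to picture a feedback loop is a harness step that runs the repo's checks and turns failures into text the agent can read on its next turn. The Python sketch below is my own minimal illustration under that assumption. The names CheckResult, run_checks, and feedback_for_agent are hypothetical, and nothing here reflects OpenAI's internal tooling.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_checks(checks: Sequence[Tuple[str, List[str]]]) -> List[CheckResult]:
    """Run each (name, command) pair and capture its exit status and output."""
    results = []
    for name, command in checks:
        proc = subprocess.run(command, capture_output=True, text=True)
        output = (proc.stdout + proc.stderr).strip()
        results.append(CheckResult(name, proc.returncode == 0, output))
    return results


def feedback_for_agent(results: Sequence[CheckResult]) -> str:
    """Summarize failing checks as plain text for the agent's next turn."""
    failures = [r for r in results if not r.passed]
    if not failures:
        return "All checks passed."
    return "\n\n".join(f"[{r.name}] failed:\n{r.output}" for r in failures)


if __name__ == "__main__":
    # Stand-in commands; a real repo would list its own lint and test commands.
    demo = [
        ("lint", [sys.executable, "-c", "print('ok')"]),
        ("tests", [sys.executable, "-c", "import sys; print('1 failed'); sys.exit(1)"]),
    ]
    print(feedback_for_agent(run_checks(demo)))
```

The design point is that the checks live in the repo and the harness only relays their output, so improving lints and tests improves every agent that runs there.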
OpenAI’s internal Symphony work points in the same direction. The Latent Space interview on harness engineering and Symphony describes a deeper orchestration layer for spinning up, supervising, reworking, and coordinating coding agents across complex engineering workflows.
That is the frontier I care about: agent-first repositories, versioned knowledge, tool legibility, automated review loops, multi-agent supervision, and durable control surfaces. It inspired me to create my own implementation, OpenSymphony, which is now my favorite tool for building complex software projects.
Open source changes the trust equation
OpenAI still keeps its strongest frontier models closed. That remains a fair criticism.
OpenAI has also released meaningful open-source and open-weight artifacts: Whisper, Codex CLI, GPT-OSS, and Privacy Filter. Those releases give developers real infrastructure they can run, inspect, and build around.
Anthropic has strong research and strong products, but it has not released open-source models. Claude Code remains closed. When the Claude Code source surfaced, Anthropic treated the exposure as something to contain.
That difference affects developer trust. Open artifacts create surface area for experimentation. Closed artifacts create dependence on the vendor’s product boundary.
Anthropic’s communication problem has become a trust problem
Anthropic’s trust problem stems from an accumulation of access surprises, ambiguous rules, and product regressions.
During the 2025 OpenAI-Windsurf acquisition talks, Anthropic cut or sharply limited Windsurf’s direct access to Claude models. Windsurf users felt the downstream instability. After the acquisition fell through, access was restored. The episode sent a clear message: strategic provider decisions can suddenly affect products built on Claude.
Anthropic also revoked Claude access for OpenAI and xAI employees, saying the access violated its terms. Benchmarking and competitive analysis are normal in this industry, so the move read less like neutral policy enforcement and more like strategic access control.
The subscription confusion then spread through the developer community. In April 2026, Claude’s pricing page briefly moved Claude Code out of Pro and into Max-only access. Anthropic later framed this as a small test. The change hit enough people that it became a public controversy before the company clarified it and then retracted it. Simon Willison’s writeup captured the confusion well.
Then came the April 2026 Claude Code regressions. Anthropic’s postmortem identified several product-layer causes: a default reasoning-effort reduction, a caching bug that cleared older thinking after idle sessions, and a system-prompt change intended to reduce verbosity that damaged coding quality. Anthropic reset usage limits for affected subscribers and published a detailed explanation.
The postmortem was useful. The episode still reinforced the perception that Claude Code can change underneath users without warning.
It also echoed Anthropic’s earlier postmortem of three other issues, where product behavior and configuration issues created confusing downstream effects for users. The repeated pattern matters more than any single incident.
Anthropic can sincerely deny intentional “nerfing.” Many users have still learned to suspect silent degradation when Claude Code gets worse. That is a customer-relations problem, independent of the company’s intent.
The copyright issue adds reputational pressure. Anthropic has used strong language around “distillation attacks” and model misuse while also agreeing to a record-setting $1.5 billion authors’ settlement over claims involving pirated copyrighted books. The legal details are nuanced. The optics are not. A company that strictly polices how others use its models has less room for ambiguity in its own data practices.
Capacity is part of product quality
Capacity constraints are real for every frontier AI company. The practical difference is how those constraints show up for users.
Anthropic’s recent capacity pressure has shown up through tighter limits, subscription boundary changes, pricing-page confusion, quality regressions, and frequent user complaints about Claude Code availability. For developers using these tools all day, those are product-quality issues.
OpenAI appears to have more usable capacity for token-heavy workflows. Its limits have generally felt more generous and more stable, and the OpenAI status page has looked dramatically healthier than the Anthropic status page during periods when Claude Code users were struggling with availability and reliability.
This matters because coding agents are workflow infrastructure. A flaky chatbot is annoying. A flaky coding agent blocks work, interrupts context, breaks automation, and makes teams hesitant to build deeper processes around the tool.
Capacity is not a backend detail. It is part of the product.
Shipping and accessibility beat hand-picked access
The GPT-5.5 launch is especially important for cybersecurity because it shows OpenAI’s access philosophy.
In the GPT-5.5 launch, OpenAI framed advanced cybersecurity capability as something defenders need across the ecosystem. Its “Trusted Access for Cyber” program starts with Codex and gives verified users expanded access to GPT-5.5’s advanced defensive capabilities with fewer unnecessary refusals. Critical-infrastructure defenders can apply for access to cyber-permissive models such as GPT-5.4-Cyber under stricter requirements.
That is a democratized access posture: vetted users, open application paths, broader defender access, and iterative deployment.
Anthropic’s Mythos approach has been more selective. Project Glasswing gives access to a hand-picked set of major companies and infrastructure organizations. Restricted access makes sense for powerful cyber capabilities, but Anthropic’s implementation favors a small selected group rather than an open application process for a broader verified defender community.
The difference also appears in code security and code review products.
Claude Code Security has been framed as a limited preview for selected customers and maintainers. Codex security capabilities are being pulled into a broader trusted-access model. Claude Code Review pricing has been around $15 to $25 per review. Codex review has been far cheaper in its rate-card framing, around $1 per pull request, or included as part of the broader Codex subscription experience.
That difference affects adoption. Defensive AI tooling becomes more useful when individual developers, open-source maintainers, small teams, and large enterprises can all access it through a clear path.
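The review prices quoted above compound quickly at team scale. The short Python calculation below uses only those approximate per-review figures. The volume of 200 pull requests per month is a hypothetical chosen for illustration, and actual pricing depends on plan and usage.

```python
def monthly_review_cost(pull_requests: int, price_per_review: float) -> float:
    """Total monthly spend if every pull request gets one automated review."""
    return pull_requests * price_per_review


PRS_PER_MONTH = 200  # hypothetical team volume, not a figure from any vendor

claude_low = monthly_review_cost(PRS_PER_MONTH, 15.0)   # 3000.0
claude_high = monthly_review_cost(PRS_PER_MONTH, 25.0)  # 5000.0
codex = monthly_review_cost(PRS_PER_MONTH, 1.0)         # 200.0

print(f"Claude Code Review: ${claude_low:,.0f} to ${claude_high:,.0f} per month")
print(f"Codex review:       ${codex:,.0f} per month")
print(f"Ratio:              {claude_low / codex:.0f}x to {claude_high / codex:.0f}x")
```

At that assumed volume the gap is $3,000 to $5,000 per month against $200, a 15x to 25x difference before counting any review capacity bundled into a subscription.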
GPT-5.5 challenges the model-dominance narrative
The GPT-5.5 benchmark story is useful when viewed as a pattern rather than a score table.
In OpenAI’s launch table, GPT-5.5 beats Claude Opus 4.7 on Terminal-Bench 2.0, GDPval, OSWorld-Verified, and CyberGym. Opus 4.7 beats GPT-5.5 on SWE-Bench Pro and FinanceAgent.
That split matters. Anthropic still has a strong claim in difficult coding benchmarks and finance-agent evaluation. OpenAI has a stronger showing across terminal use, computer use, knowledge work, and cyber tasks.
GPT-5.5 does not need to dominate every benchmark for OpenAI to have the better platform position. The model is strong enough to challenge Claude across the relevant agentic surface, while Codex gives users a more flexible and interoperable way to use it.
The value-per-cost picture also favors OpenAI in many developer workflows. Claude Code remains excellent, but its ecosystem has become more fragile around limits, access rules, review pricing, and third-party harness restrictions. Codex is gaining capability while becoming more available, more portable, and easier to integrate.
That combination is what changes the slope.
Codex is gaining product momentum
Codex has been improving at a pace that matters.
The Codex app brought parallel agents, worktrees, review flows, automations, and deeper integration into the OpenAI product surface. Codex App Server created a path for external applications to integrate with the same harness layer. OpenAI’s enterprise push has also made Codex a serious workplace product rather than a developer toy. Today, Tibo announced that, alongside GPT-5.5, Codex has released “full browser use, global dictation, non-dev mode, a new auto-review mode that is much safer than yolo, in-app docs and PDF viewer.”
The adoption numbers are moving quickly. OpenAI said Codex had crossed 3 million weekly developers in April 2026, then passed 4 million later that month as it launched Codex Labs and partnerships with Accenture, Capgemini, CGI, Cognizant, Infosys, PwC, and Tata Consultancy Services to push enterprise adoption.
The exact number matters less than the direction: Codex is no longer just a model wrapper chasing Claude Code. It is becoming a full developer platform with first-party surfaces, third-party integration paths, open-source components, enterprise distribution, and a frontier model line tuned for agentic work.
The real comparison
The OpenAI versus Anthropic comparison now comes down to product philosophy.
Anthropic has the coding-agent mindshare. Claude Code is powerful and culturally dominant among many AI-native developers. Anthropic’s weakness is the boundary around that product: closed harness, Claude-specific conventions, subscription restrictions, policy ambiguity, access surprises, unreliable uptime, and a growing trust problem.
OpenAI’s advantage is the developer-platform direction: interoperability, third-party harness support, portable repo standards, open-source components, harness integration protocols, broader cyber access, stronger capacity, and faster Codex iteration.
Those differences map directly to customer experience.
Developers want to use the best model inside the best harness for the job. They want repo instructions that travel across tools. They want subscription access that does not punish experimentation. They want code review and security workflows that scale economically. They want status and rate limits stable enough for daily work. They want enough openness to inspect, integrate, and build around the system.
OpenAI is closer to that future.
Claude Code may dominate the current mindshare for agent harnesses, yet Codex is becoming the more important substrate. GPT-5.5 strengthens that substrate by making OpenAI competitive across the model layer while Codex compounds across the workflow layer.
That is the bull case: OpenAI is building the more open, portable, integration-friendly, capacity-backed platform for agentic engineering. Anthropic still owns much of the current narrative. OpenAI is building the conditions for the next one.



Great writeup. I've been using Codex exclusively for the past several weeks after a year of mainly relying on Claude and I've never regretted it. It's the same kind of feeling as when I first switched from Windows to Mac (sorry Windows people) -- "wow, this just works! wow, everything is so user-friendly and intuitive! why didn't I know about this before?" I'm still skeptical of Altman and OpenAI's leadership but as a developer I'm going to use the best product, period, and Codex is definitely dominating Claude right now.