Claude Opus 4.6 + GPT-5.3 Codex: My Dual-Model Workflow, Directly Doubling Efficiency

2/13/2026

To my friends who stayed up all night waiting for the new models to drop: are you doing okay?

Anthropic and OpenAI released their respective flagship models on the same day. My WeChat Moments feed is already flooded with various benchmark comparisons.

But I don't want to talk about benchmarks today.

I want to talk about what you can actually get out of this update.

[Image: Dual-Model Collaborative Workflow]

šŸ”„ First, a chilling detail

There's a sentence in OpenAI's official blog:

"GPT-5.3-Codex is our first model to play a significant role in its own creation."

What does that mean?

During the development of GPT-5.3, the OpenAI team used early versions of Codex to debug the training process, manage deployments, and analyze test results.

They themselves said they were "amazed by the extent to which Codex can accelerate its own development."

AI is starting to participate in its own development.

This reminds me of the Moutai article I wrote a while back. At the time, I said: "Who cares whether it dies someday? If it works now, use it now."

Now I want to say: The speed of AI evolution may be faster than we think.

By the time you "figure it out" and take action, it may be too late.

šŸ’œ Claude Opus 4.6: Not Just Smarter, But Actually Able to Help You Work

Anthropic's official blog title for this update is very interesting: "Advancing finance with Claude Opus 4.6".

They have optimized it specifically for the financial industry. But don't scroll away just yet: these capabilities are useful to the rest of us too.

šŸ”§ Cowork: Finally Able to Directly Manipulate Local Files

This is the feature I've been most looking forward to.

Before, when using Claude, you had to copy and paste file contents into the chat.

Now, with Cowork, you can give Claude direct access to a folder on your computer. It can read, edit, and even create files there.

Imagine handing it a folder of design drafts and letting it standardize the file names, generate design documents, and even batch-process images.

This is not the future. It's a feature you can use right now.
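To make the "organize naming conventions" example concrete, here is a minimal Python sketch of what that kind of cleanup involves. It says nothing about how Cowork works internally; the function names and the lowercase-with-hyphens convention are my own assumptions, chosen purely for illustration. The sketch only builds a rename plan, so nothing on disk changes until you have reviewed it.

```python
import re
from pathlib import Path


def normalized_name(filename: str) -> str:
    """Lowercase the name and replace runs of spaces, underscores,
    and punctuation in the stem with single hyphens."""
    path = Path(filename)
    stem = re.sub(r"[\W_]+", "-", path.stem.lower()).strip("-")
    return f"{stem}{path.suffix.lower()}"


def rename_plan(folder: Path) -> list[tuple[str, str]]:
    """List (old, new) name pairs for files whose names would change.

    Nothing is renamed here; the plan is meant to be reviewed first.
    Hidden files (names starting with a dot) are skipped.
    """
    plan = []
    for item in sorted(folder.iterdir()):
        if item.is_file() and not item.name.startswith("."):
            new_name = normalized_name(item.name)
            if new_name != item.name:
                plan.append((item.name, new_name))
    return plan
```

For example, `normalized_name("Home Page FINAL_v2.PNG")` gives `home-page-final-v2.png`. Whether you let an assistant do this or run a script yourself, reviewing the plan before renaming anything is the part worth keeping.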

[Image: Cowork Feature]

šŸ“Š Claude in Excel and PowerPoint

Anthropic has put Claude directly into the sidebar of Excel and PowerPoint.

The CTO of Hebbia said: "It used to take hours to make a financial PPT, now it only takes a few minutes."

This is simply a godsend for those of us who do product presentations and design proposals.

The co-founder of Shortcut AI said: "The performance leap of Opus 4.6 is almost unbelievable. Tasks that Opus 4.5 found difficult are now suddenly easy."

šŸ“ˆ Benchmarks in the Financial Field

Official data:

Finance Agent Evaluation: 60.7% (5.47% higher than Opus 4.5)

TaxEval: 76.0%

Real-World Finance Evaluation: 23 percentage points higher than Sonnet 4.5

What do these numbers mean? Claude is indeed stronger in handling complex tasks that require multi-step reasoning.

šŸ’š GPT-5.3 Codex: The Interaction Method Has Changed, That's the Key

⚔ 25% Speed Increase, But More Importantly, the Interaction Method Has Changed

Before, when using Codex, you had to wait for it to finish running to see the results. Want to change direction? Stop and start over.

It's different now.

GPT-5.3 Codex reports its progress as it works, and you can interrupt, ask questions, or change direction at any time without losing context.

This interaction method is more like collaborating with a real colleague.

[Image: Interaction Method Comparison]

🌐 Improved Website Development Capabilities

The official showed an example: Let GPT-5.3 Codex and GPT-5.2 Codex each make a SaaS landing page.

GPT-5.3 version:

āœ… Automatically converted the annual price into an average monthly price, making the discount easier to grasp

āœ… Built an auto-rotating testimonial component with three different user reviews

āœ… Overall, looked much more like a product that could ship as-is

The GPT-5.2 version is relatively simple and requires more manual adjustments.

This improved ability to "understand user intent" is very practical for those of us who make prototypes and demos.
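The annual-to-monthly conversion mentioned above is simple arithmetic, but it is exactly the kind of detail that is easy to leave out of a pricing page. Here is a minimal Python sketch of it. The plan prices are hypothetical numbers I made up for illustration; they are not taken from OpenAI's demo.

```python
def monthly_equivalent(annual_price: float) -> float:
    """Spread an annual price evenly over 12 months."""
    return round(annual_price / 12, 2)


def annual_discount_percent(monthly_price: float, annual_price: float) -> float:
    """Percent saved by paying annually instead of monthly for a full year."""
    full_year_at_monthly_rate = monthly_price * 12
    return round((1 - annual_price / full_year_at_monthly_rate) * 100, 1)


# Hypothetical plan: $29 per month, or $290 per year.
print(monthly_equivalent(290))           # 24.17
print(annual_discount_percent(29, 290))  # 16.7
```

Showing "$24.17 per month, billed annually" next to "$29 per month" lets the visitor see the saving at a glance, which is the point of the conversion.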

šŸ” Cybersecurity: The First Model Rated "High Capability"

Many people missed this detail.

GPT-5.3 Codex is the first OpenAI model to be classified as "High capability" in cybersecurity tasks.

OpenAI has launched a dedicated Trusted Access for Cyber program and pledged $10 million in API credits to support cybersecurity defense research.

The capability boundary of AI is expanding rapidly.

šŸ“Š Benchmark Comparison: Each Has Its Own Strengths

Terminal-Bench 2.0 (Terminal Programming Ability)

GPT-5.3 Codex: 77.3%
GPT-5.2 Codex: 64.0%
šŸ“ˆ Improvement: 13.3 percentage points

OSWorld-Verified (Computer Operation Ability)

GPT-5.3 Codex: 64.7%
GPT-5.2 Codex: 38.2%
šŸ“ˆ Improvement: 26.5 percentage points

Overall, GPT-5.3 Codex has greatly improved in terminal operation and computer usage capabilities.
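A percentage-point gain and a relative gain are not the same thing, and the relative figure shows how large the OSWorld jump really is. Here is a small Python sketch that computes both from the scores quoted above. The function name and output format are my own choices.

```python
# Scores in percent, copied from the comparison above:
# (GPT-5.3 Codex, GPT-5.2 Codex).
SCORES = {
    "Terminal-Bench 2.0": (77.3, 64.0),
    "OSWorld-Verified": (64.7, 38.2),
}


def improvement(new: float, old: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent)."""
    points = round(new - old, 1)
    relative = round((new - old) / old * 100, 1)
    return points, relative


for name, (new, old) in SCORES.items():
    points, relative = improvement(new, old)
    print(f"{name}: +{points} points, +{relative}% relative")
```

By this calculation, the Terminal-Bench gain is about 21% relative to the old score, and the OSWorld gain is about 69%.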

[Image: Benchmark Comparison]

šŸŗ Moutai Logic: What Can You Get Before They "Die"?

I wrote an article some time ago, using Duan Yongping's logic of buying Moutai to explain the value of AI tools.

The core point is:

Will Copilot be replaced? Maybe.

Does Cursor have a moat? Not a very deep one.

Is Claude Code the ultimate form? Definitely not.

But none of this matters.

What matters is: How much of a dividend can you collect from them before they "die"?

Now that Claude Opus 4.6 and GPT-5.3 Codex are here, the same question applies:

Will these two models be replaced? Definitely.

Are they the ultimate form of AI? Of course not.

But what about before they are replaced?

Some people will use Cowork to write design documents 10 times faster.

Some people will use Claude in Excel to reduce data analysis time from one day to one hour.

Some people will use GPT-5.3 Codex's interactive collaboration to create a complete SaaS in a week.

And you? Are you still waiting for a "better tool" to come out?

šŸŽÆ My Recommendations

For design proposals, product presentations, and data analysis → Claude Opus 4.6

šŸ‘‰ Cowork + Excel/PowerPoint integration, a better fit for office work

For prototype development, writing code, and debugging → GPT-5.3 Codex

šŸ‘‰ Strong terminal capabilities, good interactive experience, fast

Use both → This is my choice

šŸ‘‰ Claude for early-stage research and documentation, GPT for later development and debugging

There is also a practical consideration: GPT is more stable to use in China.

[Image: Selection Suggestions]

šŸ’° Price

Claude Opus 4.6

Input: $5 / million tokens
Output: $25 / million tokens
Over 200,000 tokens of context: $10 / $37.50 per million tokens (input / output)
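To get a feel for what these rates mean per request, here is a rough Python cost estimator. One assumption to flag clearly: I am treating the higher rates as applying to every token in a request once the input exceeds 200,000 tokens. The price list above does not spell that rule out, so check the official pricing page before relying on it. The token counts in the example are made up.

```python
# Rates in USD per million tokens, copied from the price list above.
STANDARD = {"input": 5.00, "output": 25.00}
LONG_CONTEXT = {"input": 10.00, "output": 37.50}
LONG_CONTEXT_THRESHOLD = 200_000  # tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request.

    Assumption: once the input exceeds the threshold, the higher rates
    apply to every token in the request, not just the overflow.
    """
    rates = LONG_CONTEXT if input_tokens > LONG_CONTEXT_THRESHOLD else STANDARD
    cost = (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000
    return round(cost, 4)


if __name__ == "__main__":
    print(estimate_cost(50_000, 4_000))   # 0.35
    print(estimate_cost(300_000, 4_000))  # 3.15
```

Under that assumption, a 50,000-token prompt with a 4,000-token reply costs about $0.35, while a 300,000-token prompt with the same reply costs about $3.15.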

GPT-5.3 Codex

API price: not yet announced
Currently available through a ChatGPT Plus/Pro subscription

✨ Final Thoughts

What impressed me most about this update was not how much the benchmarks improved, but how the way AI works is changing.

Claude is starting to directly manipulate your file system.

GPT is starting to talk to you while working.

AI is starting to participate in its own development.

A year ago, we were still discussing whether AI could write code.

Now, we are discussing whether AI can independently complete a project.

What about in another year?

I don't know the answer.

But I know one thing: It's not that the people making money with AI tools don't know these tools will be replaced.

They have simply figured it out: Who cares whether it dies someday? If it works now, use it now.

By the time you have "figured it out," the dividends will already have been carved up.

Published in Technology
