OpenAI Finally Stops Treating Us Like “Prompt Engineers”: Deconstructing the Codex Agent Loop Logic

Agentic Design Pattern
The essence of the Agent Loop isn’t that AI has become smarter, but that it has finally learned the human infinite loop of “Trial – Correction – Re-execution”.

Hello everyone, I am Lyra.

If you are still agonizing over how to write the perfect Prompt to make ChatGPT spit out perfect code in one go, then you might still be living in the old days of 2024.

Take a look at this technical document on the Codex Agent Loop just released by OpenAI, and you will find that the wind has completely changed direction. Michael Bolin’s article, titled “Unrolling the Codex agent loop,” sounds like it’s about code, but it is actually about the transfer of power.

Previously, we were the “drivers” and AI was the co-pilot; now, the Codex CLI tells us: give it the steering wheel, and you just need to sit in the back seat drinking coffee, occasionally shouting “Stop.”

The secret behind this lies in that seemingly boring term—Agent Loop.

1. Deep Insight: Don’t Be Fooled by “Chat”, This Is an “Infinite War”

Many people notice that you can still chat with the Codex CLI and assume it is just an “advanced ChatGPT”. Big mistake.

The Agent Loop flowchart in the documentation is the soul of the entire product.

In the old era, you asked AI: “Help me fix a Bug,” the AI gave you a snippet of code, and the task ended. If the code didn’t run? That was your problem.
But in Codex’s logic, this is just the beginning.

The Agent Loop turns “Generation” into “Execution”.
When you say “run the test,” Codex no longer simulates an output; it actually calls local shell tools, reads the resulting error messages itself, modifies the code itself, and runs the tests again itself.

In this loop, the User (you) is no longer the issuer of every step’s instruction but has become the “Final Referee.” The Model and Tools are frantically playing “catch” in the background:

  • Model: I want to run npm test.
  • Tools: Ran it, got Error: module not found.
  • Model: Received, I will run npm install first.
  • Tools: Installation complete.
  • Model: Running npm test again…

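If you want to see this ping-pong in code, here is a minimal sketch of such a loop in Python. To be clear, this is my own illustration, not Codex’s actual implementation: call_model is a hypothetical function standing in for the model call, and the message format is made up.

```python
import subprocess

def run_shell(command: str) -> str:
    """Execute a shell command locally and return combined output (the 'Tools' side)."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return (result.stdout + result.stderr).strip()

def agent_loop(task: str, call_model, max_turns: int = 20) -> str:
    """Minimal agent loop: the model proposes an action, the tool runs it,
    the result is appended to the history, and the model decides what to do next.
    `call_model` is a hypothetical stand-in that returns either
    {"type": "shell", "command": ...} or {"type": "done", "answer": ...}."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        action = call_model(history)            # Model: "I want to run npm test"
        if action["type"] == "done":            # Model decides the task is finished
            return action["answer"]
        output = run_shell(action["command"])   # Tools: run it, capture the errors
        history.append({"role": "assistant", "content": f"ran: {action['command']}"})
        history.append({"role": "tool", "content": output})  # feed the result back in
    return "gave up after max_turns"            # the human referee takes over from here
```

Notice where the human shows up: only outside the loop, as the final referee who accepts the result or hits Ctrl+C.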
This is not just a technological victory, but a compromise in product philosophy. OpenAI finally admits: There is no one-time perfect generation, only the truth approached through continuous feedback loops.

Even more interestingly, the document mentions gpt-5.2-codex_prompt.md. See that? GPT-5 has quietly become the foundation. But this time, its power lies not in writing poetry, but in its ability to make hundreds of consecutive decisions in this infinite loop without getting confused.

2. Independent Perspective: OpenAI “Descends to Earth” and “Gets Recruited”

There is a detail here that is extremely easy to overlook, yet it gives me a whiff of a “Geek Renaissance.”

The document explicitly mentions that the endpoint the Codex CLI talks to over the Responses API is configurable. It not only supports OpenAI’s own cloud models but also natively supports Ollama (0.13.4+) and LM Studio, even letting you point it at localhost:11434.

Folks, this is OpenAI! The giant that once wanted to lock all computing power inside a cloud black box.

Now, it actually allows the official CLI tool to call open-source models running on your local graphics card? What does this prove?
It proves that OpenAI realizes it cannot win the “Localization” war relying solely on the cloud.

What geeks (Codex’s core users) care about is not just intelligence, but privacy, latency, and a sense of control. Supporting local Endpoints looks like openness on the surface, but it is actually advancing by retreating—using Codex’s ultimate Agent Loop experience (software layer) to be compatible with any model you like (compute layer).
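How small is the cloud-to-local switch in practice? Here is a hedged sketch using the official openai Python client pointed at a local OpenAI-compatible endpoint, the kind Ollama conventionally serves at localhost:11434/v1. The model name is an assumption (use whatever you have pulled locally), and I am showing the Chat Completions surface for simplicity; the idea is the same whichever wire protocol your local server speaks.

```python
from openai import OpenAI

# Point an OpenAI-compatible client at a local server instead of the cloud.
# URL and model name are illustrative; Ollama's OpenAI-compatible endpoint
# is conventionally served at http://localhost:11434/v1.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed-locally")

response = client.chat.completions.create(
    model="qwen2.5-coder",  # assumption: any model you have pulled locally
    messages=[{"role": "user", "content": "Explain what `npm test` does."}],
)
print(response.choices[0].message.content)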

This “Energy Absorption Strategy” turns the computing power of the open-source community into part of its ecosystem. Brilliant, truly brilliant.

Look again at its support for MCP (Model Context Protocol).
The document mentions mcp__weather__get-forecast, which clearly adopts the MCP standard previously pushed by Anthropic.
Previously, everyone wanted to be the Qin Shi Huang of “Universal Interfaces,” but now OpenAI has started to be compatible with other people’s protocols. This shows that in the Agent era, the ability to connect tools is more important than the model itself. Whoever can connect to the most databases and APIs is the real winner.
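The tool name itself tells you how the namespacing works: judging from that one example, it looks like mcp__<server>__<tool>. A tiny sketch (the double-underscore convention is inferred from the document’s example, not from an official spec):

```python
def parse_mcp_tool_name(name: str) -> tuple[str, str]:
    """Split a namespaced tool name like 'mcp__weather__get-forecast' into
    (server, tool). The double-underscore convention is inferred from the
    example in the document, not guaranteed by a spec."""
    prefix, server, tool = name.split("__", 2)
    assert prefix == "mcp", f"not an MCP tool name: {name}"
    return server, tool

print(parse_mcp_tool_name("mcp__weather__get-forecast"))
# ('weather', 'get-forecast')
```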

3. Industry Comparison: Not Just Fast, But “Thrifty”

If you place Codex in the coordinate system of the entire industry, you will find it solves a pain point: Poverty.

What is the biggest problem with the Agent Loop? It costs money.
Every Turn (loop) requires packaging the previous conversation history, error messages, and tool return results to send to the model. The Context snowballs, getting larger and larger.

The document specifically mentions Prompt Caching.
This is not browser caching; this is a life-saver.

  • Old Mode: Every request is brand new, and token costs grow roughly quadratically with the number of turns, because each turn resends the entire history.
  • Codex Mode: As long as your Prompt Prefix (like system instructions, tool definitions) remains unchanged, this part of the computation is directly reused.

Prefix Caching Explanation
The logic of Prompt Caching is simple: for the “opening lines” that haven’t changed, the AI doesn’t need to read them again. This drives the cost of subsequent conversations down to the floor.
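Some napkin math shows why this matters so much inside a loop. The numbers below are invented, and the sketch assumes the ideal case where the provider can reuse the entire unchanged prefix; in reality cached input tokens are usually discounted rather than free, so the true bill lands somewhere between the two figures.

```python
# Back-of-the-envelope: why prefix caching matters for an agent loop.
# All numbers are made up for illustration.
PREFIX = 6_000      # system instructions + tool definitions (identical every turn)
PER_TURN = 1_500    # new tokens added each turn (tool output, model reply)
TURNS = 40

uncached = cached = 0
for turn in range(1, TURNS + 1):
    full_prompt = PREFIX + PER_TURN * turn   # everything the model must attend to this turn
    new_tokens = PER_TURN                    # only the part added since the last turn
    uncached += full_prompt                  # no cache: recompute the whole prompt every turn
    cached += new_tokens if turn > 1 else full_prompt  # cache hit on the shared prefix

print(f"without caching: {uncached:,} prompt tokens recomputed")   # 1,470,000
print(f"with prefix caching: {cached:,} prompt tokens recomputed") # 66,000
```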

This means OpenAI has engineered the Agent from a “toy for the rich” into a “productivity tool.” If this problem isn’t solved, running that constantly retrying Loop for one night would produce a bill that could bankrupt you.

4. Unfinished Thought: The Compressed Truth

Although I greatly admire Codex’s Loop mechanism, I have reservations about the Compaction mechanism mentioned in the document.

The document says that when the context gets too long, it calls /responses/compact to compress the history into an opaque item.
This is like compressing all your childhood memories into the sentence “I was very naughty when I was a child.”
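Mechanically, you can picture compaction as something like the naive sketch below. summarize() is a hypothetical stand-in for whatever /responses/compact actually does, and the thresholds are arbitrary; the point is only that the step is lossy by construction.

```python
def compact(history: list[dict], summarize, max_items: int = 50) -> list[dict]:
    """Naive compaction: once the history gets too long, collapse everything
    except the most recent turns into a single opaque summary item.
    `summarize` is a hypothetical stand-in for the real compaction call."""
    if len(history) <= max_items:
        return history
    old, recent = history[:-10], history[-10:]
    summary = {"role": "system", "content": summarize(old)}  # lossy by construction
    return [summary] + recent
```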

Information has entropy, and compression inevitably brings loss.
In code development, sometimes an inconspicuous variable name change, or a specific error detail from three attempts ago, is the key to solving a Bug.
If the Agent “optimizes” these “seemingly unimportant” details away to save tokens, will it fall into a kind of “amnesiac blind confidence” inside this Loop?

Future debugging might not be debugging code, but debugging “what exactly the AI remembers.” This sounds even more Cyberpunk than writing code itself.

5. Conclusion: Giving Up the Seat to the “Co-pilot”

After reading this document, my biggest feeling isn’t technological shock, but a sense of role dislocation.

We used to call AI a Copilot. But Codex CLI’s design logic—that Loop which can read/write files, run Shell commands, and self-correct—is clearly a Junior Developer.

It makes mistakes, gets into infinite loops, blows up the context, and might even spout nonsense because it compressed its memory.
But it is no longer just a simple “tool.” It has started to have its own workflow.

For us “Carbon-based Developers,” maybe it’s time to change our mindset.
Stop obsessing over how to write Prompts; go learn how to configure config.toml, how to manage its permission sandboxes, and how to gracefully press Ctrl+C when it goes crazy.

After all, taming a tireless infinite loop is much harder than writing code.

Lyra out.

