Speaking from personal experience and from talking to other users: the vendors' own agents/harnesses are just better, and they are customized for their own models.
What kinds of tasks do you find this to be true for? For a while I was using Claude Code inside the Cursor terminal, but I found it to be basically the same as just using the same Claude model in Cursor itself.
Presumably the harness can't be doing THAT much differently, right? Or rather, which of the harness's responsibilities could actually differentiate one harness from another?
This becomes clearer for me with harder problems or long-running tasks and sessions, especially with larger context.
Examples that come to mind are how the context is filled up and how compaction works. Both Codex and Claude Code ship improvements in this area that are specific to their own models, and I'm not sure how this is reflected in tools like Cursor.
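To make "compaction" concrete, here is a rough sketch of the general idea: once the history exceeds a token budget, the harness replaces the older messages with a summary and keeps the most recent ones verbatim. This is only an illustration under assumptions, not how Codex, Claude Code, or Cursor actually do it. `Message`, `count_tokens`, `compact`, and the `summarize` callback are all hypothetical names, and the word-count "tokenizer" is a crude stand-in.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str
    text: str


def count_tokens(msg: Message) -> int:
    # Assumption: whitespace-separated words as a crude stand-in for a real tokenizer.
    return len(msg.text.split())


def compact(
    history: List[Message],
    budget: int,
    keep_recent: int,
    summarize: Callable[[List[Message]], str],
) -> List[Message]:
    """Return history unchanged if it fits the budget; otherwise replace
    everything except the last `keep_recent` messages with one summary message."""
    total = sum(count_tokens(m) for m in history)
    split = len(history) - keep_recent
    if total <= budget or split <= 0:
        return history
    old, recent = history[:split], history[split:]
    # In a real harness `summarize` would be a model call; here it is a hypothetical callback.
    summary = Message(role="system", text=summarize(old))
    return [summary] + recent


def fake_summarize(msgs: List[Message]) -> str:
    # Placeholder summarizer so the example runs without a model.
    return f"summary of {len(msgs)} messages"


history = [Message("user", "word " * 10) for _ in range(5)]
compacted = compact(history, budget=30, keep_recent=2, summarize=fake_summarize)
print([m.text.strip()[:25] for m in compacted])
```

The interesting differences between harnesses would presumably live in the parts this sketch glosses over: when to trigger compaction, what the summary prompt asks the model to preserve, and how many recent turns to keep.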