>Opus 4.5 has really startled me - it genuinely can do complex software engineering tasks which I'd expect a proficient developer to take hours in minutes with very few defects.
I don't use Opus, but I use Sonnet 4.5 and ChatGPT 5.1, which are only a bit down the chart.
In my daily experience, these tools can help with many tasks - scaffolding CRUD, writing tests, and explaining how this or that part of the code works are three that come to mind.
But a mature piece of software has usually graduated to the point where it has numerous subsystems, layers and integrations that crosstalk with each other in often hacky ways. And my work is smack dab in the middle of that. Writing a feature or fixing a bug in that soup, I have found, is something that the best AIs will slow you down with as often as they speed you up.
And that doesn't even take into consideration that a very large part of my job is just defining what the bug or feature is, before I can even begin to code. And when I'm done with the coding, let's keep in mind the fun, time-consuming processes known as "code review" and "deploy to customer".
I don't think the benchmarks catch this very well. Opus 4.5 is _significantly_ better than Sonnet 4.5 in my experience, far more than the SWE Bench scores would say. I can happily leave Opus 4.5 running for 20-30 minutes and come back to very high quality software on complex tasks/refactoring. Sonnet 4.5 would fall over within a couple of minutes on these tasks.