I don't think the benchmarks catch this very well. Opus 4.5 is _significantly_ better than Sonnet 4.5 in my experience, far more than the SWE-bench scores would suggest. I can happily leave Opus 4.5 running for 20-30 minutes and come back to very high-quality software on complex tasks/refactoring. Sonnet 4.5 would fall over within a couple of minutes on these tasks.




What does "very high quality" mean here?


