Neither the original issue (having three models) nor this one (unconsolidated payments) has anything to do with the end result or the quality of the output.
Executing multiple agents on the same model also works.
I find it helpful to even change the persona of the same agent (the prompt) or the model the agent is using. These variations always help, but I've found that having multiple different agents with different LLMs in the backend works better.
I love where you're going with this. In my experience it's not really about a different persona: it's about injecting context that triggers different activations, which produces a different outcome. You can achieve that by switching to an agent with a separate persona, of course, but you can also get it simply by injecting new context or forcing the agent to consider something new. I feel like this concept gets cargo-culted a little bit.
I've personally moved to a pattern where I use mastra agents in my project to achieve this. I've slowly shifted the bulk of the code research and web research to my internal tools (built with small TypeScript agents). I can now really easily bounce between different tools such as Claude, Codex, and opencode, and my coding tools spend more time orchestrating work than doing the work themselves.
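To make that pattern concrete, here's a minimal sketch of an orchestrator delegating to small single-purpose agents. None of these names come from mastra's actual API; `handle` is a stand-in for a real LLM call, and the whole thing is illustrative only.

```typescript
// Sketch: the coding tool coordinates; small agents do the work.
type ToolAgent = {
  name: string;
  handle: (task: string) => string; // stand-in for a real model call
};

function makeToolAgent(name: string, run: (task: string) => string): ToolAgent {
  return { name, handle: run };
}

// The orchestrator routes each task to the named agent and collects results,
// so the main tool never does the research itself.
function orchestrate(
  agents: ToolAgent[],
  tasks: { agent: string; task: string }[]
): string[] {
  return tasks.map(({ agent, task }) => {
    const a = agents.find((x) => x.name === agent);
    if (!a) throw new Error(`no agent named ${agent}`);
    return a.handle(task);
  });
}

// Hypothetical agents standing in for code research and web research tools.
const codeResearch = makeToolAgent("code-research", (t) => `code-research: ${t}`);
const webResearch = makeToolAgent("web-research", (t) => `web-research: ${t}`);

const results = orchestrate(
  [codeResearch, webResearch],
  [
    { agent: "code-research", task: "find the relevant module" },
    { agent: "web-research", task: "check the provider docs" },
  ]
);
```

The point of the shape is that swapping the backing tool (Claude, Codex, opencode) only changes what `handle` calls, not the orchestration logic.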
Thank you, and I do like the mastra-agents concept as well. I'd love to explore adding something similar in the future, so that you can quickly create subagents and assign tasks to them.
That might be true, but if you change the system instructions (which sit at the beginning of the prompt), the cache won't hit. So different agents would most likely skip caching; you only get the benefit when the shared prefix is identical and just the last part of the prompt differs.
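A minimal sketch of why that happens, assuming a provider that caches the longest previously seen prefix of the prompt. The `PrefixCache` class is illustrative, not any real provider API, and real providers match on token prefixes rather than characters, but the effect is the same: a changed system prompt up front leaves almost nothing to reuse.

```typescript
// Length of the shared leading characters of two prompts.
function commonPrefixLen(a: string, b: string): number {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}

// Toy prefix cache: lookup returns how much of the prompt was reusable
// from earlier calls, then stores the prompt for future lookups.
class PrefixCache {
  private prompts: string[] = [];

  lookup(prompt: string): number {
    let best = 0;
    for (const p of this.prompts) {
      best = Math.max(best, commonPrefixLen(p, prompt));
    }
    this.prompts.push(prompt);
    return best;
  }
}

const cache = new PrefixCache();
const systemA = "You are a careful code reviewer.\n";
const systemB = "You are a creative brainstormer.\n";

// First call: nothing cached yet.
cache.lookup(systemA + "Review file1.ts");
// Same system instructions, new user message: the whole prefix is reused.
const hitSamePersona = cache.lookup(systemA + "Review file2.ts");
// Different system instructions at the start: the cached prefixes barely match.
const hitDifferentPersona = cache.lookup(systemB + "Review file1.ts");
```

Here `hitSamePersona` covers at least the full system prompt, while `hitDifferentPersona` falls short of even that, which is why per-agent system prompts tend to forfeit caching.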
Can you comment on that?