That's not the only thing you need to manage. A system-level sandbox is all about limiting the physical scope of what the LLM agent can reach ("physical" here meaning interaction with the system through the shell and syscalls). But what about the logical scope it can reach before anything is handed to the physical scope? For example: git branch/commit, npm run build, kubectl apply, or psql running scripts that truncate your SQL table or delete the database. Those are not easy to control, because whether they are safe depends on concrete, contextual details.
Sure, but at least we can slow down that fat finger by adding safeguards and clear boundary checks. With an LLM agent, things are automated at a much higher pace, and more "fat fingers" can happen simultaneously, which can have a cascading effect that is beyond repair. This is why we need not just physical limitations but logical limitations as well.
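To make "logical limitation" a bit more concrete, here is a minimal sketch of a command gate that sits in front of the sandbox and decides whether a command proposed by the agent is allowed, needs human confirmation, or is denied. Everything in it is an assumption for illustration: the `RULES` table, the `check` function, the three verdict names, and the choice of which commands land in which bucket. It is not how any particular agent framework does this.

```python
import re
import shlex

# Hypothetical policy table: (program, leading arguments, verdict).
# The buckets are illustrative; a real policy depends on your environment.
RULES = [
    ("npm", ("run", "build"), "allow"),
    ("git", ("branch",), "allow"),
    ("git", ("commit",), "allow"),
    ("git", ("push",), "confirm"),
    ("kubectl", ("apply",), "confirm"),
    ("psql", (), "confirm"),
]

# Inline SQL that truncates or drops is denied outright.
# Note: this only sees SQL passed on the command line, not the
# contents of a script file handed to psql.
DESTRUCTIVE_SQL = re.compile(r"\b(truncate|drop\s+(table|database))\b", re.IGNORECASE)


def check(command: str) -> str:
    """Return 'allow', 'confirm', or 'deny' for a proposed shell command."""
    argv = shlex.split(command)
    if not argv:
        return "deny"
    prog, args = argv[0], tuple(argv[1:])

    if prog == "psql" and DESTRUCTIVE_SQL.search(" ".join(args)):
        return "deny"

    for rule_prog, prefix, verdict in RULES:
        if prog == rule_prog and args[: len(prefix)] == prefix:
            return verdict

    # Default-deny: anything the policy does not know about is rejected.
    return "deny"


if __name__ == "__main__":
    for cmd in [
        "npm run build",
        "kubectl apply -f deploy.yaml",
        'psql -c "TRUNCATE TABLE users"',
    ]:
        print(cmd, "->", check(cmd))
```

A gate like this does not replace the sandbox. It only adds a slower, reviewable step for actions whose risk depends on context, and a simple prefix match will miss plenty of cases (shell pipelines, aliases, SQL inside script files), so treat it as a starting point rather than a complete safeguard.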