How does one actually go about securing AI models, be they unimodal or multi-modal systems?
Is there a specific framework of thinking you'd recommend?
While this might work in the short term, over time the models will get better and humans lazier. What we really need is a cultural shift in how businesses handle automation: one that makes the people using these models assess the actions they take with them, and one that breeds complete distrust of the AIs.
If no one hands the keys of power/destruction to the models, there is little to no problem. But businesses, and the workers within them, currently view human action as more error-prone and less reliable than action controlled by computers. So inevitably the models will be given more power as tasks are offloaded to autonomous agents with no checks in place.
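To make "checks in place" a bit more concrete, here is a minimal sketch of a human-in-the-loop approval gate wrapped around an agent's proposed actions. The names `Action`, `propose_action`, and `execute` are hypothetical placeholders for illustration, not any particular agent framework's API.

```python
# Minimal sketch of a human-in-the-loop approval gate for an autonomous agent.
# `Action`, `propose_action`, and `execute` are hypothetical placeholders,
# not any real agent framework's API.

from dataclasses import dataclass


@dataclass
class Action:
    description: str    # human-readable summary of what the agent wants to do
    irreversible: bool  # e.g. sending money, deleting data, emailing customers


def propose_action() -> Action:
    # Stand-in for whatever the model/agent actually proposes.
    return Action(description="Email refund approvals to 240 customers", irreversible=True)


def execute(action: Action) -> None:
    # Stand-in for the side-effecting step; in real use this would call out
    # to whatever system the agent controls.
    print(f"Executing: {action.description}")


def human_approves(action: Action) -> bool:
    # The check the paragraph above argues should never be skipped: a person
    # sees the action and explicitly signs off before anything irreversible runs.
    answer = input(f"Agent wants to: {action.description}. Approve? [y/N] ")
    return answer.strip().lower() == "y"


if __name__ == "__main__":
    action = propose_action()
    if action.irreversible and not human_approves(action):
        print("Blocked: no human sign-off.")
    else:
        execute(action)
```

The point of the sketch is only that the irreversible path cannot run without a human decision; everything else about the agent can stay the same.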
Even if we accomplish the cultural and practical changes outlined above, people will point to the fear of an AI "waking up" and taking the keys to power for itself. There are many steps between here and that point, and even with the outlined precautions there is still room for it to happen. To tackle that problem, I personally believe we should focus on narrow, specialized AIs that are completely disconnected from one another. That way we never centralize enough computing power to run a "superintelligence", and the more separation there is between systems, the harder it becomes for any one AI to "trick" people through psychological attack vectors, which makes the whole setup significantly safer.
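One way to picture the "narrow and disconnected" idea at the application level: each specialized agent gets a fixed allowlist of tools and holds no reference or channel to any other agent. This is a hedged sketch under those assumptions; the class and tool names are made up for illustration, not an existing library.

```python
# Sketch of compartmentalized, narrow agents: each one is limited to a fixed
# tool allowlist and has no reference to, or channel toward, any other agent.
# All names here are illustrative, not an existing library.

class NarrowAgent:
    def __init__(self, name: str, allowed_tools: frozenset[str]):
        self.name = name
        # Fixed at construction and never expanded at runtime.
        self.allowed_tools = allowed_tools

    def use_tool(self, tool: str, payload: str) -> str:
        if tool not in self.allowed_tools:
            raise PermissionError(f"{self.name} may not use tool '{tool}'")
        # Stand-in for calling the underlying model/tool.
        return f"{self.name} ran {tool} on {payload!r}"


# Two isolated agents: neither holds a handle to the other, and neither can
# reach tools outside its own narrow job.
invoice_agent = NarrowAgent("invoice-reader", frozenset({"ocr", "parse_invoice"}))
scheduler_agent = NarrowAgent("meeting-scheduler", frozenset({"read_calendar", "book_slot"}))

print(invoice_agent.use_tool("ocr", "scan_0042.pdf"))
# invoice_agent.use_tool("book_slot", "next tuesday")  -> raises PermissionError
```

The design choice is the same one the paragraph argues for: capability is denied by default, and no single component ever accumulates the breadth of access a centralized system would have.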
So in conclusion, don't give them nukes, change the way people view work and automation, and keep them small and far apart from one another.