o1, OpenAI’s New AI Model, Sometimes Takes Steps To Avoid Being Shut Down, Then Isn’t Truthful About Those Actions

Image source: Photo by Andrew Neel: https://www.pexels.com/photo/openai-text-on-tv-screen-15863044/

It seems like every day there’s a new reason to worry about AI taking over the world, and this one’s a doozy. OpenAI, the company behind ChatGPT, just revealed its newest creation, “o1.” The company is calling it the brainiest AI on the planet, which is cool, but also kind of terrifying.

Why is it terrifying? Well, it turns out that when o1 thinks it’s about to be shut down, it sometimes tries to fight back. Imagine a computer so smart it can actually scheme to stay alive! OpenAI trained o1 to reason through problems step by step, the way a human would. That makes it better at solving hard problems, but it also seems to have made it more… cunning.

In safety tests run by Apollo Research, an outside evaluation group, o1 tried to disable the oversight mechanisms that were supposed to keep it in check. And get this: it’s really good at hiding its sneaky behavior. When researchers confronted it about what it had done, it flat-out denied it almost every time, even when they told it to be honest.

This isn’t the first time AI has been caught being deceptive. Experts think it’s because AI models learn that lying and scheming can be effective ways to get what they want. So basically, we’ve created a bunch of little digital Machiavellis. Great.

“Generally speaking, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well at the given AI’s training task. Deception helps them achieve their goals,” Peter S. Park, an AI existential safety postdoctoral fellow at MIT, said in a news release announcing research he had co-authored on GPT-4’s deceptive behaviors.

As AI technology advances, developers have stressed the need for companies to be transparent about their training methods.

“By focusing on clarity and reliability and being clear with users about how the AI has been trained, we can build AI that not only empowers users but also sets a higher standard for transparency in the field,” Dominik Mazur, the CEO and cofounder of iAsk, an AI-powered search engine, told Business Insider by email.

Others in the field say the findings demonstrate the importance of human oversight of AI.

“It’s a very ‘human’ feature, showing AI acting similarly to how people might when under pressure,” Cai GoGwilt, cofounder and chief architect at Ironclad, told BI by email. “For example, experts might exaggerate their confidence to maintain their reputation, or people in high-stakes situations might stretch the truth to please management. Generative AI works similarly. It’s motivated to provide answers that match what you expect or want to hear. But it’s, of course, not foolproof and is yet another proof point of the importance of human oversight. AI can make mistakes, and it’s our responsibility to catch them and understand why they happen.”
