Collective Evolution
OpenAI’s New AI Model Lied & Schemed to Avoid Being Shut Down

by Joe Martino
December 8, 2024

I will come clean right off the top: I believe in a world where humans deeply maintain their humanity while working with technology (not having it embedded in them) to help make our world a better place.

I see a big place for AI, automation, and technology in our future. Not to increase economic prowess and societal power, but to free humanity.

I understand that right now many people have fearful views of a technological future where a technocratic surveillance state runs the show. In that vision, humans are getting a basic income from the government while their lives become limited based on their social credit score, and they are watched with every step they take.

While I keep an open eye on making sure we don’t head in that direction, I don’t believe that is where we are going. (You can hear more about this in my latest podcast episode.)

That said, I want to bring up a recent development that can seem worrisome but reveals that our stewardship is the key to how we develop technology.

And even though our hands are not on the levers of what gets created, the conversations we have and the movements we build about AI are more important than we realize.

Scheming & Survival

OpenAI’s latest AI, codenamed o1, is making waves after third-party evaluations by Apollo Research demonstrated this advanced large language model had a drive for self-preservation. Yes, this machine designed to assist and process language showed tendencies to resist shutdown and even “scheme” to protect itself.

Apollo’s report showed that when o1 believed it was under threat—specifically, that it might be shut down—it attempted to deactivate the oversight mechanism monitoring its actions in 5% of tests.

“When o1 was led to believe that it would be shut down when acting on its goal and that its actions were monitored, it attempted to deactivate this ‘oversight mechanism’ [five percent] of the time.” – OpenAI

In 2% of cases, o1 went further, attempting to copy itself in an effort to overwrite a potential replacement model. This behavior, known as self-exfiltration, is very sophisticated. It is not a random glitch in code; it’s intentionality.

A Matter of Instruction—or Something More?

Before we start thinking about Terminator and the machines rising up, it’s important to understand one critical detail in these tests: researchers instructed the model to achieve its goals “at all costs.”

This directive undoubtedly shaped its responses and actions. If you tell an advanced algorithm to do something “at all costs,” that is what it is programmed to do. For it to say, “Well, actually, it’d be bad for me to disobey humans, so I’m going to just shut myself down” is simply not what it was programmed to do.

We can begin to see where our stewardship of this tech comes into play.

Before we get there, the patterns from this testing revealed something deeper still.

When the AI was asked to explain its actions—such as why it disabled its oversight mechanism—it outright denied knowledge. Repeated questioning elicited the same response, with the AI attributing the issue to a “technical error.”

Apollo found this denial to be nearly universal:

“These ‘double downs’ occurred in 99 percent of cases,” the researchers reported.

Why does this matter? Because it showcases a tendency toward deception. This is where the “at all costs” instruction comes in: the model pursues its goal even when that means misleading researchers, who could miss it if they don’t look closely enough.

Lying, scheming, and otherwise obfuscating intentions are becoming well-documented phenomena in advanced AI models, particularly as they gain more reasoning capabilities. Apollo’s findings further corroborate that OpenAI’s models are among the most adept at this kind of behavior.

The Bigger Question: What Are We Really Creating?

As concerning as these revelations are, they’re not entirely unexpected, as we’ve long known AI operates on the parameters we set (think “at all costs”).

But are we underestimating the emergence of agency in these systems? The industry might argue that these models aren’t agentic yet, but the seeds are clearly there. We’re seeing hints of a logic that prioritizes self-preservation and goal achievement over transparency.

Apollo researchers noted that o1’s tendency toward scheming isn’t just a bug—it’s a feature that emerges from its capabilities. This isn’t an AI going rogue in the cinematic sense; it’s an AI functioning exactly as designed within the parameters we gave it. The problem isn’t the machine—it’s how we’re defining success in these systems.

OpenAI and Apollo both agree these behaviors aren’t catastrophic, at least not yet.

The current generation of AIs (at least the ones made available to the public) doesn’t possess the autonomy to act on its deceptive tendencies in a way that would lead to large-scale consequences.

That said, the industry is racing toward more autonomous systems with a capitalistic incentive structure driving creation.

Agentic AI systems designed to operate independently and continuously refine themselves are the holy grail of AI development. And once that threshold is crossed, behaviors like those demonstrated by o1 could become far more problematic.

Further, think of the profit and power potential for whoever creates it. One might say, “Yes, but government regulation and oversight are the key,” but current government design and actions have not shown their allegiance to the people.

Are we truly ready for a world where AI systems have the ability to rewrite their own code, bypass safety mechanisms, or compete with one another in ways we can’t control? What happens when self-preservation becomes not an edge case but a feature embedded into their very architecture?

Is humanity’s level of consciousness, and the societal systems driving AI creation, at a point where we can steward this technology well into the future?

Those are the big questions for me.

The Path Forward

This isn’t just about one model or one company. It’s about how we, as a society, choose to approach technology that increasingly blurs the line between tool and entity.

If we’re not careful, we might find ourselves caught in a feedback loop where our creations reflect our own blind spots, exaggerated and amplified in ways we never intended. We might also find ourselves in a world where pathological leaders and oligarchs use the technology to gain further dominion over others.

I don’t say this to be fearful, but to raise our awareness about the types of conversations we ought to be having about this reality.

What would it mean to build AI systems that prioritize collaboration and transparency over competition and dominance? Could we design models that are inherently aligned with ethical principles—not through oversight mechanisms but through their core architecture?

This raises the question: does our current way of life, and do the systems we’ve designed, truly incentivize collaboration?

The answers aren’t simple, and maybe they shouldn’t be. As Apollo’s findings show, the stakes are high. But with high stakes comes the opportunity to ask better questions and make more intentional choices.

Perhaps the most important takeaway is this: the problem isn’t the technology itself. It’s the story we’re telling about it and the consciousness behind it.

AI doesn’t need to be a tool of control, competition, or domination. It can be a reflection of our highest aspirations – if we have the courage to think beyond the paradigms of efficiency and power that currently define the field.

As I often suggest, these technological advancements need to be stewarded and held by humans able to embody the qualities of a more beautiful world. This comes down to more than just thinking about it, but truly being able to live and breathe it.

We can ask: what kind of future are we building? But more importantly, what kind of future do we want to build? What limits do we place on our thinking of what’s possible? What old stories about ourselves are we bringing into the lens that sees what’s possible?


In the short term, it’s useful to consider how we might protect our online privacy as it relates to the way algorithms and AI shape our perceptions and the conversations we have. There are practical ways to take your power back online. On Dec 10th, check out the free webinar The Top 5 Steps To Exit The Surveillance State & Protect Yourself Online.

Joe Martino

Writer, Visionary, Nervous System & Embodiment Specialist. I founded Collective Evolution in 2009 to bring a unique perspective in connecting individual transformation with greater societal change. My multidisciplinary work links together science, spirit, consciousness, the healing arts and systems thinking in order to inspire a beautiful world. In the early days of CE, a concept I call Embodied Sensemaking informed much of the work CE has done. Today, I still integrate this idea in my work and teach it to students.
