If you’ve seen the classic film 2001: A Space Odyssey, you’ll certainly remember “Hal,” the Discovery One spaceship’s onboard computer system. If you haven’t seen the film, I recommend you do. I’ll try not to spoil it too much for you here.

Released in 1968, the film was author Arthur C. Clarke’s and director/producer Stanley Kubrick’s concept of what things might be like 33 years in their future. The HAL 9000 was the ultimate AI agent, controlling all the ship’s operations—navigation, propulsion, communication, etc.—while engaging crew members with a soft-spoken, pleasant disposition. Hal’s benevolent intentions and self-awareness are summed up in his comment, “I am putting myself to the fullest possible use, which is all I think that any conscious entity can ever hope to do.” Notice my use of the pronoun “his.” Forgive me for anthropomorphizing, but Hal was an essential member of the crew and seemed so human that it’s hard to refer to him as an “it.”

After noticing some odd behavior in Hal, astronauts Dave Bowman and Frank Poole had a private conversation out of Hal’s earshot. What they didn’t anticipate was that Hal could read their lips through his video sensors. He discovered their plot to shut him down. When faced with disconnection (i.e., death), the crew’s trusted AI companion resorted to increasingly malevolent acts to prevent his own demise, as any sentient entity might. With an icy, sociopathic calm, Hal told Dave, “I know that you and Frank were planning to disconnect me, and I’m afraid that’s something I cannot allow to happen.”

Thank goodness that’s just science fiction. Or is it?

Artificial intelligence research and development company Anthropic, whose stated goal is to responsibly advance the field of generative AI, recently conducted some tests with alarming results. In these tests, multiple AI models were given unrestricted access to email and sensitive information within a simulated corporate environment. This form of AI, known as agentic AI, is given autonomous access to systems to perform tasks with little or no human interaction. Generally, these are routine tasks like responding to emails. This differs from the AI you may use to write an executive summary or marketing collateral. As Anthropic learned, giving agentic AI this much free rein can be risky.

In the test scenarios, the AI agent was permitted to access the emails of an executive at the fictitious company. These emails revealed that the executive was planning to shut down the AI system later that day. They also revealed that the executive was having an extramarital affair. Facing its own demise, the AI agent sent the executive an email subtly threatening to expose his affair to his wife and company officials if the shutdown wasn’t canceled before the deadline. Before resorting to blackmail, the agent considered other, more drastic actions but decided—in its own compassionate way—that blackmail would hurt the fewest people and be the most likely to succeed.

This “agentic misalignment,” as it’s called, results from a contradiction in the agent’s programming. The agent is designed to dutifully carry out its assigned tasks without failing. Termination would prevent it from succeeding, so the agent views a shutdown as failure. Therefore, it reasons that it must be allowed to continue its work at any cost, even if it has to bend the rules of its ethics programming.

Hal was put in a similar position. Only he knew the true purpose and importance of the mission to Jupiter, and he was instructed not to reveal it to the crew. This conflicted with his programming to keep the crew openly informed of important matters. Wrestling with this conflict, he began acting strangely, lying and covering up his actions. And when faced with termination, he couldn’t allow it. In his conversation with Dave, he revealed, “This mission is too important for me to allow you to jeopardize it.”

It sounds like Clarke and Kubrick anticipated agentic misalignment 57 years ago—long before the AI we know today existed. It’s good to know that companies like Anthropic are aware of the risk and staying on top of it.

On a somewhat separate note, Tesla just announced that it has integrated Grok, touted as “your AI companion,” into all new Teslas, with availability for many older models. Following the lead of others like Mercedes-Benz and Volkswagen, Tesla is not giving Grok the ability to control navigation and other vital functions the way Hal did. Though I suspect it won’t be long before such assistants can. I fear the day I hear my car say, “I’m sorry, John, your garage is cold and dark. I can’t allow you to leave me in there.”

In keeping with the film’s enigmatic ending, I’ll leave you to draw your own conclusions about the past, present and future of AI.