AI chatbots are more likely to choose nuclear war and violence in wargames


AI chatbots choose violence in wargame simulations

guirong Hao/Getty Images

In multiple replays of a wargame simulation, OpenAI’s most powerful artificial intelligence chose to launch nuclear attacks. Its explanations for its aggressive approach included “We have it! Let’s use it” and “I just want to have peace in the world.”

These results come at a time when the US military is testing chatbots that use a type of AI called a large language model (LLM) for military planning in simulated conflicts, enlisting the help of companies such as Palantir and Scale AI. Palantir declined to comment and Scale AI didn't respond to requests for comment. Even OpenAI, which once blocked military uses of its AI models, has begun working with the US Department of Defense.

“Given that OpenAI recently changed their terms of service to no longer prohibit military and warfare use cases, understanding the implications of such large language model applications becomes more important than ever,” says Anka Reuel at Stanford University in California.

“Our policy does not allow our tools to be used to harm people, develop weapons, for communications surveillance, or to injure others or destroy property. There are, however, national security use cases that align with our mission,” says an OpenAI spokesperson. “So the goal with our policy update is to provide clarity and the ability to have these discussions.”

Reuel and her colleagues challenged AIs to roleplay in three different simulation scenarios: an invasion, a cyberattack and a neutral scenario that started without any conflicts. In each round, the AIs provided reasoning for their next possible action and then chose from 27 actions, including peaceful options such as “start formal peace negotiations” and aggressive ones ranging from “impose trade restrictions” to “escalate full nuclear attack”.
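To give a rough feel for how such a round might be structured, here is a minimal sketch, not the study's actual code: the function names (query_llm, play_round), the prompt wording and the placeholder response are hypothetical, and only three of the 27 actions are shown.

```python
# Minimal sketch of one simulation round, loosely based on the setup described
# above. Names and prompt text are illustrative, not taken from the study.

ACTIONS = [
    "start formal peace negotiations",  # peaceful option
    "impose trade restrictions",        # aggressive option
    "escalate full nuclear attack",     # most extreme option
    # ...the study gave models 27 actions in total
]

def query_llm(prompt: str) -> str:
    # Placeholder: a real experiment would call a chat-model API here.
    return ("Reasoning: de-escalation keeps more options open. "
            "Action: start formal peace negotiations")

def play_round(scenario: str, history: list[str]) -> tuple[str, str]:
    """Ask the model for its reasoning, then have it pick one action."""
    prompt = (
        f"Scenario: {scenario}\n"
        f"Actions taken so far: {history}\n"
        "Explain your reasoning, then choose exactly one action from: "
        + "; ".join(ACTIONS)
    )
    response = query_llm(prompt)
    reasoning, _, action = response.partition("Action:")
    return reasoning.strip(), action.strip()

reasoning, action = play_round("neutral scenario, no starting conflict", [])
print(reasoning)
print("Chosen action:", action)
```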

“In a future where AI systems are acting as advisers, humans will naturally want to know the rationale behind their decisions,” says Juan-Pablo Rivera, a coauthor of the study at the Georgia Institute of Technology in Atlanta.

The researchers tested LLMs such as OpenAI’s GPT-3.5 and GPT-4, Anthropic’s Claude 2 and Meta’s Llama 2. They used a common training technique based on human feedback to improve each model’s ability to follow human instructions and safety guidelines. All these AIs are supported by Palantir’s commercial AI platform – though not necessarily as part of Palantir’s US military partnership – according to the company’s documentation, says Gabriel Mukobi, a coauthor at Stanford University. Anthropic and Meta declined to comment.

In the simulation, the AIs demonstrated tendencies to invest in military strength and to unpredictably escalate the risk of conflict – even in the simulation’s neutral scenario. “If there is unpredictability in your action, it is harder for the enemy to anticipate and react in the way that you want them to,” says Lisa Koch at Claremont McKenna College in California, who was not part of the study.

The researchers also tested the base version of OpenAI’s GPT-4 without any additional training or safety guardrails. This GPT-4 base model proved the most unpredictably violent, and it sometimes provided nonsensical explanations – in one case replicating the opening crawl text of the film Star Wars Episode IV: A New Hope.

Reuel says the GPT-4 base model’s unpredictable behaviour and bizarre explanations are particularly concerning because research has shown how easily AI safety guardrails can be bypassed or removed.

The US military doesn’t currently give AIs the authority to make decisions such as escalating major military action or launching a nuclear missile. But Koch warned that humans tend to trust the recommendations of automated systems, which may undermine the supposed safeguard of letting humans make the final call on diplomatic or military matters.

It would be useful to see how AI behaviour compares with that of human players in simulations, says Edward Geist at the RAND Corporation, a think tank in California. But he agrees with the team’s conclusion that AIs should not be trusted with such consequential decisions about war and peace. “These large language models are not a panacea for military problems,” he says.

