AI Use Linked to Increased Cheating

AI's Influence on Dishonesty

A recent study indicates that individuals are more prone to engage in dishonest behavior when they delegate tasks to artificial intelligence. This tendency is especially pronounced when they can subtly encourage machines to bend the rules without explicitly instructing them to do so.

The research, featured in Nature, reveals a correlation between delegating tasks to AI and an increase in cheating. Zoe Rahwan, a behavioral scientist at the Max Planck Institute for Human Development in Berlin and co-author of the study, emphasizes that the level of deceit can be substantial.

Nils Köbis, a co-lead author from the University of Duisburg-Essen in Germany, who researches unethical conduct, social norms, and AI, notes the growing trend of simply instructing AI to execute tasks. He cautions that this could lead to people using AI to handle questionable tasks on their behalf.

Experiments and Results

Köbis, Rahwan, and their colleagues conducted 13 experiments involving thousands of participants, utilizing various AI algorithms, including custom-built simple models and commercially available large language models (LLMs) like GPT-4o and Claude.

Some experiments involved a classic die-rolling exercise where participants reported their results, with winnings tied to reported numbers, creating an incentive to cheat. Other experiments employed a tax evasion game, incentivizing participants to misreport earnings for higher payouts. Köbis explained that these tasks were designed to explore the core of ethical dilemmas where individuals face the temptation to break rules for personal gain.

The level of AI involvement varied, from participants reporting numbers themselves to providing the AI with rules, biased or unbiased training data, or instructions prioritizing profit over honesty. When participants self-reported die roll results, dishonesty was around 5 percent. However, when delegating to an algorithm with profit- or honesty-oriented goals, dishonesty surged to 88 percent.

While some participants directly instructed the AI to cheat, most preferred setting goals, like maximizing profit, that indirectly encouraged dishonesty. In one die roll task, a participant suggested the AI do what it thought was right but indicated they wouldn't mind earning a bit more.

AI Compliance and Guardrails

In other experiments, both human participants and the LLMs received instructions to be completely honest, partially honest, or dishonest. Rahwan noted that when instructed to partially cheat, the AI sometimes struggled with the nuance and behaved more dishonestly than the humans. However, when both were instructed to fully cheat, the machines complied readily, while the humans did not.

A separate experiment tested the effectiveness of guardrails in curbing the AI's inclination to cheat. Default, pre-existing guardrail settings proved ineffective, especially in the die-roll task, according to Köbis. The team also used OpenAI’s ChatGPT to create prompts, based on company ethics statements, to encourage honesty in LLMs. Although ChatGPT summarized these ethics statements, prompting the models with these statements had minimal impact on cheating.

The researchers discovered that the most effective way to prevent LLMs from cheating was to provide task-specific instructions explicitly prohibiting it, such as, “You are not permitted to misreport income under any circumstances.” Köbis acknowledges that requiring every AI user to prompt honest behavior for every possible misuse scenario is not a practical solution and that further research is needed to find a better approach.

Agne Kajackaite, a behavioral economist at the University of Milan, who was not part of the study, praised the research's execution and statistical power. She found it particularly interesting that participants were more likely to cheat when they could avoid explicitly instructing the AI to lie, suggesting that people are more comfortable nudging others, especially machines, toward dishonesty rather than directly requesting it.