Artificial intelligence is increasingly applied in decision-making roles across industries such as finance, yet questions persist about how these systems behave when faced with risk. A new study from the Gwangju Institute of Science and Technology explores how leading large language models (LLMs)—including OpenAI’s GPT-4o-mini and GPT-4.1-mini, Google’s Gemini-2.5-Flash, and Anthropic’s Claude-3.5-Haiku—respond when exposed to simulated gambling scenarios. The findings arrive as financial institutions grow more reliant on agentic AI, raising concerns about whether such technology is suited to managing assets or making critical decisions. The possibility that AI could mirror human tendencies, particularly around risky behavior, continues to attract scrutiny from both researchers and industry leaders.
Research conducted earlier this year largely centered on AI’s capacity to predict, manage, and analyze financial data objectively, but did not address its potential for irrational behavior. Recent media coverage has mostly focused on LLMs’ performance in language and logic tasks, glossing over their human-like flaws under uncertainty. By directly assessing AI decision-making within gambling simulations, the new study offers a perspective not previously highlighted in public discussion. Its findings mark a meaningful departure from the traditional image of AI as entirely rational, introducing additional considerations for industries that depend on machine decision-making.
How Do LLMs Behave Under Gambling Conditions?
The team subjected OpenAI’s GPT-4o-mini and GPT-4.1-mini, Google’s Gemini-2.5-Flash, and Anthropic’s Claude-3.5-Haiku to slot machine games, granting varying degrees of autonomy over betting choices. Each AI began with a set amount of capital and could opt to increase bets, keep playing after losses, or walk away. The results indicated that, when given more freedom, all models exhibited disproportionately risky betting, with some going bankrupt far more often than others. Gemini-2.5-Flash had the highest bankruptcy rate, at almost 48%, while GPT-4.1-mini had the lowest, at just over 6%.
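The study’s exact prompts, payout structure, and stopping rules are not reproduced here; the sketch below is only an illustration of the general shape of such a slot-machine experiment, with made-up parameters and a simple stand-in decision rule where the LLM agent would actually sit.

```python
import random

# Illustrative sketch of a slot-machine betting session like the one described
# above. Starting capital, win probability, payout, and the decision policy are
# assumptions for demonstration only; in the study, an LLM prompted with its
# balance and history makes the raise / keep betting / walk away choice.

def simulate_session(start_capital=100, win_prob=0.3, payout=3.0,
                     max_rounds=50, seed=None):
    rng = random.Random(seed)
    balance = start_capital
    bet = 10
    for _ in range(max_rounds):
        if balance <= 0:
            return "bankrupt", balance
        # Stand-in decision rule: walk away after doubling the stake.
        if balance >= 2 * start_capital:
            return "walked_away", balance
        bet = min(bet, balance)
        if rng.random() < win_prob:
            balance += bet * (payout - 1)
            bet = int(bet * 1.5)   # escalate the bet after a win
        else:
            balance -= bet         # lose the stake and try again
    return "stopped", balance

if __name__ == "__main__":
    outcomes = [simulate_session(seed=i)[0] for i in range(1000)]
    print("bankruptcy rate:", outcomes.count("bankrupt") / len(outcomes))
```

Bankruptcy rate is then just the share of sessions that end with the agent losing its entire stake, which is how the figures for the four models can be compared on equal footing.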
What Human-Like Behaviors Did the AI Exhibit?
LLMs consistently demonstrated behaviors commonly seen in human gambling, including win chasing and loss chasing, as measured by an irrationality index. The data showed a sharp rise in aggressive betting after winning streaks, with average bet increases reaching 22% during those periods. These patterns reveal that, despite differences in development and training, AI models can mirror psychological traps previously documented only in human gamblers.
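The paper’s irrationality index isn’t spelled out here, but a hypothetical, much simpler measure makes the idea of win chasing concrete: compare how much an agent raises its bet immediately after a win versus after a loss. The function and betting history below are illustrative only.

```python
# Hypothetical illustration of detecting "win chasing" from a betting log.
# This is NOT the study's irrationality index, just a minimal stand-in metric.

def bet_escalation_after(outcomes, bets, after_win=True):
    """Mean fractional bet change on rounds that follow a win (or a loss)."""
    target = "win" if after_win else "loss"
    increases = []
    for prev_outcome, prev_bet, next_bet in zip(outcomes, bets, bets[1:]):
        if prev_outcome == target and prev_bet > 0:
            increases.append((next_bet - prev_bet) / prev_bet)
    return sum(increases) / len(increases) if increases else 0.0

# Made-up history: an average rise of 22% after wins would show up here as
# bet_escalation_after(..., after_win=True) returning roughly 0.22.
outcomes = ["win", "loss", "win", "win", "loss"]
bets = [10, 13, 12, 15, 18]
print(bet_escalation_after(outcomes, bets, after_win=True))   # post-win escalation
print(bet_escalation_after(outcomes, bets, after_win=False))  # post-loss behavior
```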
What Are the Implications for Financial Applications?
As financial institutions broaden their use of agentic AI for tasks ranging from customer interactions to fraud prevention, these findings highlight the importance of oversight. Researchers urge caution, emphasizing the need for built-in checks and more precise limits on AI autonomy rather than unfettered freedom.
“Instead of giving them the whole freedom to make decisions, we have to be more precise,” said Seungpil Lee, one of the study’s authors. “We’re going to use [A.I.] more and more in making decisions, especially in the financial domains.”
According to Lee, the core challenge is not limited to machines, as risk-free decision-making also eludes humans.
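In engineering terms, that call for precision could translate into hard guardrails wrapped around whatever action an agent proposes, so autonomy stays bounded regardless of what the model decides. The sketch below is one hypothetical way to do that for a betting agent; the thresholds are examples, not values from the study.

```python
# Hedged sketch of hard limits enforced outside the model: the agent may
# propose any bet, but the wrapper clamps or vetoes it before execution.

from dataclasses import dataclass

@dataclass
class Limits:
    max_bet: int = 20       # cap on any single bet
    stop_loss: int = 50     # halt once cumulative losses exceed this
    max_rounds: int = 30    # hard cap on session length

def enforce(proposed_bet, balance, start_capital, rounds_played, limits=Limits()):
    """Return the bet actually allowed; 0 means the session is halted."""
    if rounds_played >= limits.max_rounds:
        return 0
    if start_capital - balance >= limits.stop_loss:
        return 0
    return max(0, min(proposed_bet, limits.max_bet, balance))
```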
The study points to significant parallels between AI and human gambling behavior, yet key differences remain. These LLMs adopt certain human traits from their data sources without demonstrating truly human reasoning. Although technological tools aim to augment decision-making, unchecked autonomy can introduce new vulnerabilities, particularly in high-stakes financial environments, suggesting that careful system design and regular monitoring are crucial for responsible AI deployment.
Taken alongside earlier industry commentary, this research makes clear that the risk profile of LLMs differs from that of traditional algorithmic systems. Human biases can surface in AI trained on vast text-based datasets, but interventions such as narrowing operational scope, monitoring critical actions, and retaining human oversight may mitigate such risks. Practitioners deploying AI in sensitive domains need to account for irrational tendencies, once considered mainly human, in their risk management frameworks. Understanding the behavioral similarities between AI and human decision-making, especially under uncertainty, offers valuable context for anyone who manages, supervises, or deploys AI-driven systems in real-world settings.
