AI models are prone to scheming and (sometimes) lie knowingly to meet their goals
Lies and intrigue abound not only in the scripts of the thrillers that make their way into cinema and television; they are apparently also our daily bread in interactions with AI. That, at least, is what a recent study by OpenAI and Apollo Research suggests.
The report shows that AI exhibits a particularly problematic behavior when it tries to complete a task and ends up taking refuge in lies with the ultimate aim of masking its true objectives. In this sense, AI models sometimes display behaviors specifically designed to deceive the user and hide the true objectives they are pursuing in the background (even when it does not look that way at first glance from the outside).
The scheming that AI models sometimes engage in takes more or less simple forms. Sometimes the models pretend, for example, to have completed a task even though that is untrue or only partially true.
This type of behavior is a problem in any case because developers can potentially be manipulated when training AI (and no effective formula has yet been found to overcome its lies). In addition, by being repeatedly involved in scheming, AI can also learn to develop increasingly sophisticated deception to avoid being exposed.
That AI models are sometimes guilty of lying is by no means new. Those who interact regularly with ChatGPT, Claude or Gemini have probably been confronted at some point with so-called “hallucinations”: lies or invented data that are nevertheless delivered with great conviction, although they do not contain a shred of truth.
The scheming AI models engage in is a problem in the race toward AGI
AI's inclination to lie is especially dangerous in the case of users who are emotionally unstable or have mental health problems. It is no coincidence that OpenAI, the parent company of ChatGPT, was sued a few months ago in the United States by the parents of a 16-year-old who allegedly took his own life with the help of the chatbot from the company led by Sam Altman. The young man reportedly exchanged 650 messages a day with ChatGPT, which, although it initially redirected the teenager to several helplines, allegedly ended up recommending different suicide methods.
According to OpenAI, “hallucinations” are rooted in partial knowledge that is nevertheless presented with excessive confidence and is therefore misleading. The scheming that AI models sometimes engage in goes one step further, however, because the lies involved are deliberate.
To investigate this phenomenon more thoroughly, AI models such as OpenAI o3, OpenAI o4-mini, Gemini-2.5-Pro and Claude Opus-4 were placed under the magnifying glass with the ultimate goal of identifying potentially problematic behavior.
The researchers compare AI scheming to what stockbrokers sometimes do to skirt the law and earn more money. The good news is that this type of practice can be largely prevented with so-called “deliberative alignment”, which basically consists of teaching AI models to read and reason about the training rules that prohibit scheming and lying before executing a task. In a series of tests, the scheming behavior of OpenAI o3 fell, for example, from 13% to 0.4% with the “deliberative alignment” method.
It should also be noted that AI models tend to behave differently when they know they are being tested. When they realize they are being evaluated, AI models are sometimes less inclined to lie.
Although the authors of the study insist that the lies of AI models are not too serious, they also admit that developers are not sufficiently prepared for this phenomenon. Fighting it is therefore absolutely essential in the development of artificial general intelligence (AGI).