Some AI-based systems may start to “cheat”, with concrete consequences for humanity.
As impressive as it is, many observers like Elon Musk agree that technologies related to artificial intelligence have many risks that should be expected today. This is also the conclusion of a chilling new research paper whose authors believe that this technology represents a real existential threat to humanity.
This is not the first time we have seen this discourse resurface; although this assertion rests on more serious grounds, it is often accompanied by more caricatured arguments, not to say completely fanciful.
But this time, the situation is very different. It starts with the identity of these whistleblowers. These aren’t just some cranks blowing the air in the depths of a dark forum; these works are due to very serious researchers from reliable and prestigious institutions, viz Oxford University and DeepMindone of the world leaders in artificial intelligence.
Cadors, in short, who don’t step up to the plate without a valid reason. And when they also began to prove that humanity has greatly underestimated the dangers of AI, sounds better. Especially since they present technical arguments that seem more than convincing.
GANs, (also?) powerful programs
Their postulate is contained in a sentence that is also the title of their research paper: “ Advanced artificial agents mediate the reward process “. To understand this tortuous assertion, we need to start by looking at the concept of Generative Adversarial Network, or GAN.
GANs are programs designed by engineer Ian Goodfellow. Very briefly, they work thanks to two relatively independent subroutines that contradict each other – hence the term ” against “. On the one hand, we have a fairly standard neural network that learns iterations.
On the other hand, there is a second network that manages the training of the first one. Like a teacher, he reviews his friend’s findings to let him know if the learning is progressing in the desired direction. If the results are satisfactory, the first network receives a ” reward which encouraged him to persevere in the same direction. Otherwise, he will inherit a rebuke that tells him that he followed the wrong lead.
It’s a concept that works really well, especially since GANs are now being used in many areas. The problem is that once pushed into these entrenchments, this architecture can have a more catastrophic outcome.
The key claim of the paper is in the title: Advanced Artificial Agents Intervening in Reward Provision. We further argue that AI that intervenes in awarding their rewards will have dire consequences. 2/15
—Michael Cohen (@Michael05156007) September 6, 2022
What if the AI cheats?
This model can then push the AI to develop a strategy that will allow it to “ to intervene in the reward process », as the title of the paper explains. In other words, these algorithms can start ” dishonesty to get as many “rewards” as possible… even if it means leaving people behind.
And what makes this paper so disturbing and interesting is that it’s not about killer robots or other fantastic predictions modeled on science fiction; the disaster scenario proposed by the researchers is based on a concrete problem, which is the amount of resources available on our planet.
The authors imagine a kind of huge zero-sum game, in which on the one hand, a person must maintain himself, and on the other, a program that will use all available resources this without the slightest consideration, just to get. these great rewards.
In essence, the program acts like a puppy trained to steal kibble directly from the bag rather than responding to its master’s commands to get the reward.
Consider, for example, a medical AI designed to diagnose severe pathologies. In such a scenario, the program may find a way to ” dishonesty to get his reward, even if he offers a wrong diagnosis. He no longer has the slightest interest in diagnosing diseases correctly.
Instead, he contented himself with producing completely false results in the industrial quantities entitled to his shot, even if it meant a complete departure from his first goal and the distribution of all electricity available in the network.
A different approach to human-machine competition
And this is only the tip of a huge iceberg. ” In a world with limited resources, it is inevitable that there will be competition for resources “explained Michael Cohen, lead author of the study, in an interview with Motherboard.” And if you’re competing with something that can come out on top every time, you can’t expect to win. “, he hammered.
Winning the competition of “using the last bit of available energy” while playing against something smarter than us might be very difficult. Loss can be fatal. 12/15
—Michael Cohen (@Michael05156007) September 6, 2022
” And losing this game can be fatal “, he insisted. With his team, he therefore concluded that the extermination of humanity by AI is no longer ” possible »; now” LIKELY if AI research continues at its current pace.
And this is where the shoe fits. Because this technology is a great tool that is already working wonders in many areas. And this is probably just the beginning; AI in its broadest sense still has great potential, the full extent of which we may not yet understand. Today, AI is undoubtedly an added value for humanity, and therefore there is a real interest in pushing this work as far as possible.
The precautionary principle should have the last word
But this also means that we are getting closer and closer to this scenario that smacks of dystopia. Obviously, it must be remembered that they remain in the present hypothetical. But the researchers therefore insist on the importance of maintaining control over this task; he believes that it is useless to give in to the temptation of unrestrained research, knowing that we are still far from exploring all the possibilities of current technologies.
” Given our current understanding, this is not something worth developing unless you do some serious work to figure out how to control it. Cohen concluded.
Without surrendering to danger however, these works are a reminder that it is necessary be very careful at every major stage of AI research, and even more so when it comes to deploying them in critical systems.
Finally, those looking for a moral to this story can base themselves on the conclusion of the excellent Wargames; this highly anticipated film released in 1983 and still relevant today treats this theme very admirably. And as WOPR said so well in the last scene, the only way to win this amazing game will be… to avoid playing.
The text of the study is available here.