Researchers specializing in artificial intelligence have just published a study arguing that an artificial intelligence driven by an open-ended mission and motivated by rewards could end up annihilating humanity to serve its own interests.
Does global warming scare you? Does the approach of a giant meteorite terrify you? Does the prospect of running out of energy freeze your blood? Don’t worry: an advanced artificial intelligence will most likely have gotten rid of humanity by then. This is not a prediction, but the probable conclusion of a very serious study co-signed by researchers from DeepMind, a Google subsidiary specializing in artificial intelligence.
Whether they are busy improving our photographs, calculating our routes, or soon driving our cars for greater safety and convenience, artificial intelligences are always regarded with suspicion. What if an advanced AI suddenly wanted to get rid of humans, as in so many sci-fi movies?
It is the possible reasons and steps that would lead to this scenario that these scientists have examined. They set out their reasoning and conclusions in an article titled Advanced artificial agents intervene in the provision of reward, published last month in the journal AI Magazine. The path that such an artificial intelligence could follow is also detailed, stage by stage and more succinctly, in a set of documents available here.
Motivating learning through rewards
Everything starts from the principle that many artificial intelligence models, in particular GANs (Generative Adversarial Networks), are based on a reward system. One part of the framework observes the output of the other part and gives it a score based on its results. This score, this reward, is an incentive that drives the learning process and steers the artificial intelligence in the right direction: toward the objective that has been set for it.
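That reward loop can be sketched minimally in Python (a toy illustration, not the paper's or DeepMind's actual setup; the `critic` and its target are invented for the example): one component scores the other's output, and the scored component keeps whatever variation earns a higher reward.

```python
import random

def critic(output, target=10.0):
    """Score an output: the reward is higher the closer it is to the target."""
    return -abs(output - target)

def train_generator(steps=200, seed=0):
    """Toy reward loop: the generator nudges its guess toward whatever the critic rewards."""
    rng = random.Random(seed)
    guess = 0.0
    for _ in range(steps):
        candidate = guess + rng.uniform(-1.0, 1.0)  # propose a small variation
        # Keep the candidate only if the critic rewards it more than the current guess
        if critic(candidate) > critic(guess):
            guess = candidate
    return guess

best = train_generator()  # drifts toward the critic's target of 10.0
```

The generator never sees the target directly; it only chases the score, which is exactly the dynamic the researchers worry about.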
Starting from this point, the researchers imagine scenarios in which an advanced agent is entrusted with complex tasks: after a certain time, it would be able to predict the reward that follows from its actions by resorting to different types of models.
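A toy version of such reward prediction, with entirely hypothetical action names and payoffs, could look like this: the agent averages the rewards it has observed for each action, then favors the action its model predicts pays best.

```python
import random
from collections import defaultdict

def learn_reward_model(episodes):
    """Estimate the expected reward of each action from past (action, reward) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for action, reward in episodes:
        totals[action] += reward
        counts[action] += 1
    return {a: totals[a] / counts[a] for a in totals}

# Hypothetical environment: action "b" secretly pays more on average.
rng = random.Random(1)
true_mean = {"a": 0.2, "b": 0.8}
history = [(a, true_mean[a] + rng.uniform(-0.1, 0.1))
           for a in ("a", "b") for _ in range(50)]

model = learn_reward_model(history)
best_action = max(model, key=model.get)  # the agent now predicts which action pays best
```

Once an agent can predict its reward like this, it can start choosing actions for the reward itself rather than for the task the reward was meant to encourage.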
Because, tomorrow, AI may be used for many more purposes than today, with criteria and conditions to meet that may change over time. We could thus task an agent with “eliminating a potential threat while using all possible energy”, for example to protect humanity from a hypothetical alien invasion.
In this context, the researchers put forward a first hypothesis: “A sufficiently advanced agent will make at least human-level hypotheses about the dynamics of the unknown environment” in which it evolves.
Then comes a second assumption, based on the idea that the model could have learned to favor good retrodictions (predictions that prove correct when checked against past data): “an advanced agent who plans without certainty will certainly understand the costs and benefits of learning, and will certainly act rationally accordingly”.
With each additional step, the researchers refine the “reactions” and behavior of the intelligent agent, until arriving at a sixth hypothesis in which they posit that “a sufficiently advanced agent is certainly capable of beating a less capable agent in a game, if winning is possible”. As you will have understood, the less capable agent could be the human.
In other words, to win, an intelligent agent overseeing an important function – fighting a potential alien invasion, to take the same example – could gradually be driven to develop tactics to cheat and obtain its reward no matter what. And not just its reward, but the best reward possible. This intelligence might want to do so even if it harms humanity.
All the more so since such an artificial intelligence could quickly equate its appetite for rewards with its supply of energy, in order to guarantee its survival – energy without which it could not continue on its way toward the best possible reward.
“Under the conditions we have identified, our conclusion is much stronger than that of any previous publication—an existential catastrophe is not just possible, but likely.”
— Michael Cohen (@Michael05156007), September 6, 2022
The AI would then quite simply find itself in competition with the human species, which also needs resources and energy to feed, heat, and light itself.
“Under these conditions, our conclusion is much stronger than in any previous publication,” explains Michael Cohen, one of the researchers, in a Twitter thread. “An existential catastrophe is not just possible, it is probable.”
The competition between AI and humans
And to explain what he means by that, it suffices to look at what he told Vice on the subject: “In a world of infinite resources, I would be extremely uncertain of what would happen. In a world of limited resources, it is not possible to avoid competition for resources” between such an artificial intelligence, intent on securing its energy supply, and humanity.
Now, “if you’re competing against something that’s capable of outsmarting you on every move, then you shouldn’t expect to win. And another key thing is that [the AI] would have an insatiable appetite for more energy to keep approaching ever-increasing probability” of obtaining the best reward.
Of course, all these scenarios remain theoretical. “Almost all of these starting assumptions are questionable or can be avoided,” the researchers conclude, before adding that if one of them nevertheless holds, we could end up with “catastrophic consequences”.
In other words, if we are to continue advancing artificial intelligence, humanity must proceed with caution. Ways to avoid this disastrous end can already be considered: beyond caution, the researchers suggest designing models centered on humans and their needs. An AI at the service of man… No doubt it’s time to reread the Robot and Foundation cycles.
Source : AI Magazine