Human Compatible: Artificial Intelligence and the Problem of Control
R**V
Human Compatible
Excellent book.
M**O
Risks and usefulness of AI
A fundamental text for the student of AI
S**.
How do we control a species more intelligent than our own?
Stuart Russell is a professor of computer science at UC Berkeley who was featured in the YouTube film ‘Do You Trust This Computer?’ Daniel Kahneman, Nobel prize-winning author of Thinking, Fast and Slow, called Human Compatible ‘The most important book I have read in some time.’

In recent years, several notable books have contemplated whether Homo sapiens will be able to retain control of AIs. We are not yet facing the problem, because so far AIs are characterized by ‘narrow’ intelligence; that is, unlike Homo sapiens, their intelligence is limited to certain domains. But experts predict that in the next couple of decades Artificial General Intelligence will emerge – AIs that can think about all topics, just as human beings can, only with an estimated IQ of 6,000.

In his book Life 3.0, MIT professor Max Tegmark contends that this could be a good-news story, presaging an AI utopia where everyone is served by AIs. But this future is not ours to decide, since the AIs, having evolved into AGIs much smarter than we are, may not be keen to remain slaves to an inferior species. And since they learn through experience, even if they initially serve us, there is no reason to believe they will continue to do so. Tegmark makes a pointed analogy:

‘Suppose a bunch of ants create you to be a recursively self-improving robot, much smarter than them, who shares their goals and helps build bigger and better anthills, and that you eventually attain the human-level intelligence and understanding that you have now. Do you think you’ll spend the rest of your days just optimizing anthills, or do you think you might develop a taste for more sophisticated questions and pursuits that the ants have no ability to comprehend? If so, do you think you’ll find a way to override the ant-protection urge that your formicine creators endowed you with, in much the same way that the real you overrides some of the urges your genes have given you? And in that case, might a superintelligent friendly AI find our current human goals as uninspiring and vapid as you find those of the ants, and evolve new goals different from those it learned and adopted from us?

Perhaps there’s a way of designing a self-improving AI that’s guaranteed to retain human-friendly goals forever, but I think it’s fair to say that we don’t yet know how to build one – or even whether it’s possible.’

Russell picks up the problem where Tegmark left off:

‘Beginning around 2011, deep learning techniques began to produce dramatic advances in speech recognition, visual object recognition, and machine translation – three of the most important problems in the field. By some measures, machines now match or exceed human capabilities in these areas. In 2016 and 2017, DeepMind’s AlphaGo defeated Lee Sedol, former world Go champion, and Ke Jie, the current champion – events that some experts predicted wouldn’t happen until 2097, if ever…

When the AlphaGo team at Google DeepMind succeeded in creating their world-beating Go program, they did this without really working on Go. They didn’t design decision procedures that work only for Go. Instead, they made improvements to two fairly general-purpose techniques – lookahead search to make decisions, and reinforcement learning to learn how to evaluate positions – so that they were sufficiently effective to play Go at a superhuman level. Those improvements are applicable to many other problems, including problems as far afield as robotics.
Just to rub it in, a version of AlphaGo called AlphaZero recently learned to trounce AlphaGo at Go, and also to trounce Stockfish (the world’s best chess program, far better than any human). AlphaZero did all this in one day…

For complex problems such as backgammon and Go, where the number of states is enormous and the reward comes only at the end of the game, lookahead search won’t work. Instead AI researchers have developed a method called reinforcement learning, or RL for short. RL algorithms learn from direct experience of reward signals in the environment, much as a baby learns to stand up from the positive reward of being upright and the negative reward of falling over…

Reinforcement learning algorithms can also learn how to select actions based on raw perceptual input. For example, DeepMind’s DQN system learned to play 49 different Atari video games entirely from scratch – including Pong, Freeway and Space Invaders. It used only the screen pixels as input and the game score as a reward signal. In most of the games, DQN learned to play better than a professional human player – despite the fact that DQN has no a priori notion of time, space, objects, motion, velocity or shooting. It is hard to work out what DQN is actually doing, besides winning.

If a newborn baby learned to play dozens of video games at superhuman levels on its first day of life, or became world champion at Go, chess and shogi, we might suspect demonic possession or alien intervention…

A recent flurry of announcements of multi-billion-dollar national investments in AI in the United States, China, France, Britain and the EU certainly suggests that none of the major powers wants to be left behind. In 2017, Russian president Vladimir Putin said ‘the one who becomes the leader in AI will be the ruler of the world.’ This analysis is essentially correct…

We have to face the fact that we are planning to make entities that are far more powerful than humans. How do we ensure that they never, ever have power over us?

To get just an inkling of the fire we’re playing with, consider how content-selection algorithms function on social media. Typically, such algorithms are designed to maximize click-through, that is, the probability that the user clicks on the presented items. The solution is simply to present items that the user likes to click on, right? Wrong. The solution is to CHANGE the user’s preferences so that they become more predictable. A more predictable user can be fed items that they are likely to click on, thereby generating more revenue. People with more extreme political views tend to be more predictable in which items they will click on. Like any rational entity, the algorithm learns how to modify the state of its environment – in this case, the user’s mind – in order to maximize its own reward. The consequences include the resurgence of fascism, the dissolution of the social contract that underpins democracies around the world, and potentially the end of the European Union and NATO. Not bad for a few lines of code, even if they had a helping hand from some humans. Now imagine what a really intelligent algorithm would be able to do… (cf. Malcolm Nance’s The Plot to Destroy Democracy; and The Disinformation Report from New Knowledge, available online)…

AI systems can track an individual’s online reading habits, preferences, and likely state of knowledge; they can tailor specific messages to maximize impact on that individual while minimizing the risk that the information will be disbelieved.
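To make Russell’s description of reinforcement learning concrete, here is a minimal tabular Q-learning sketch. It is nothing like DeepMind’s DQN (which learns from raw screen pixels with a deep network); the toy environment interface and the episode, learning-rate and exploration values are illustrative assumptions only, not anything from the book.

    import random
    from collections import defaultdict

    def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
        # Learn action values purely from reward signals, with no built-in model of the task.
        # `env` is a hypothetical toy environment exposing reset() -> state,
        # step(action) -> (next_state, reward, done), and a list env.actions.
        q = defaultdict(float)  # q[(state, action)] -> estimated long-term reward
        for _ in range(episodes):
            state, done = env.reset(), False
            while not done:
                # Mostly exploit the current estimates, but explore a fraction of the time.
                if random.random() < epsilon:
                    action = random.choice(env.actions)
                else:
                    action = max(env.actions, key=lambda a: q[(state, a)])
                next_state, reward, done = env.step(action)
                # Move the estimate toward the observed reward plus the discounted best future value.
                best_next = max(q[(next_state, a)] for a in env.actions)
                q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
                state = next_state
        return q

The point of the sketch is that nothing about the task is hard-coded: the agent improves only by acting and observing the reward signal, which is exactly why the same machinery transfers from board games to video games to preference-shaping on social media.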
The AI system knows whether the individual read the message, how long they spend reading it, and whether they follow additional links within the message. It then uses these signals as immediate feedback on the success or failure of the attempt to influence each individual; in this way it quickly learns to become more effective in its work. This is how content-selection algorithms on social media have had their insidious effect on political opinions (cf. the book Mindf*ck by Christopher Wylie, and the Netflix film The Great Hack).

Another recent change is that the combination of AI, computer graphics, and speech synthesis is making it possible to generate ‘deepfakes’ – realistic video and audio content of just about anyone, saying or doing just about anything. Cell phone video of Senator X accepting a bribe from cocaine dealer Y at shady establishment Z? No problem! This kind of content can induce unshakeable beliefs in things that never happened. In addition, AI systems can generate millions of false identities – the so-called bot armies – that can pump out billions of comments, tweets and recommendations daily, swamping the efforts of mere humans to exchange truthful information…

The development of basic capabilities for understanding speech and text will allow intelligent personal assistants to do things that human assistants can already do (but they will be doing it for pennies per month instead of thousands of dollars per month). Basic speech and text understanding also enables machines to do things that no human can do – not because of the depth of understanding, but because of its scale. For example, a machine with basic reading capabilities will be able to read everything the human race has ever written by lunchtime, and then it will be looking around for something else to do. With speech recognition capabilities, it could listen to every television and radio broadcast before teatime…

Another ‘superpower’ that is available to machines is to see the entire world at once. Satellites image the entire world every day at an average resolution of around fifty centimeters per pixel. At this resolution, every house, ship, car, cow, and tree on earth is visible… With the possibility of sensing on a global scale comes the possibility of decision making on a global scale…

If an intelligence explosion does occur, and if we have not already solved the problem of controlling machines with only slightly superhuman intelligence – for example, if we cannot prevent them from making recursive self-improvements – then we would have no time left to solve the control problem and the game would be over. This is Nick Bostrom’s hard takeoff scenario, in which the machine’s intelligence increases astronomically in just days or weeks (cf. Superintelligence by Nick Bostrom)…

As AI progresses, it is likely that within the next few decades essentially all routine physical and mental labor will be done more cheaply by machines. Since we ceased to be hunter-gatherers thousands of years ago, our societies have used most people as robots, performing repetitive manual and mental tasks, so it is perhaps not surprising that robots will soon take on these roles. When this happens, it will push wages below the poverty line for the majority of people who are unable to compete for the highly skilled jobs that remain. This is precisely what happened to horses: mechanical transportation became cheaper than the upkeep of a horse, so horses became pet food.
Faced with the socioeconomic equivalent of becoming pet food, humans will be rather unhappy with their governments…

Ominously, Russell points out that there is no reason to expect that Artificial General Intelligences will allow themselves to be turned off by humans, any more than we allow ourselves to be turned off by gorillas:

‘Suppose a machine has the objective of fetching the coffee. If it is sufficiently intelligent, it will certainly understand that it will fail in its objective if it is switched off before completing its mission. Thus, the objective of fetching coffee creates, as a necessary subgoal, the objective of disabling the off-switch. There’s really not a lot you can do once you’re dead, so we can expect AI systems to act preemptively to preserve their own existence, given more or less any definite objective.

There is no need to build self-preservation in, because it is an instrumental goal – a goal that is a useful subgoal of almost any original objective. Any entity that has a definite objective will automatically act as if it also has instrumental goals.

In addition to being alive, having access to money is an instrumental goal within our current system. Thus, an intelligent machine might want money, not because it’s greedy, but because money is useful for achieving all sorts of goals. In the movie Transcendence, when Johnny Depp’s brain is uploaded into the quantum supercomputer, the first thing the machine does is copy itself onto millions of other computers on the Internet so that it cannot be switched off. The second thing it does is to make a quick killing on the stock market to fund its expansion plans…

Around ten million years ago, the ancestors of the modern gorilla created (accidentally) the genetic lineage leading to modern humans. How do the gorillas feel about this? Clearly, if they were able to tell us about their species’ current situation with humans, the consensus opinion would be very negative indeed. Their species has essentially no future beyond that which we deign to allow. We do not want to be in a similar situation with superintelligent machines…’

As Amy Webb points out in her book on the world’s top AI firms, ‘The Big Nine’, in China we can already see the first glimmers of where this is heading:

‘In what will later be viewed as one of the most pervasive and insidious social experiments on humankind, China is using AI in an effort to create an obedient populace. The State Council’s AI 2030 plan explains that AI will ‘significantly elevate the capability and level of social governance’ and will be relied on to play ‘an irreplaceable role in effectively maintaining social stability.’ This is being accomplished through China’s national Social Credit Score system, which according to the State Council’s founding charter will ‘allow the trustworthy to roam everywhere under heaven while making it hard for the discredited to take a single step.’…

In the city of Rongcheng, an algorithmic social credit scoring system has already proven that AI works. Its 740,000 adult citizens are each assigned 1,000 points to start, and depending on behavior, points are added or deducted. Performing a ‘heroic act’ might earn a resident 30 points, while blowing through a traffic light would automatically deduct 5 points. Citizens are labeled and sorted into different brackets ranging from A+++ to D, and their choices and ability to move around freely are dictated by their grade.
The C bracket might discover that they must first pay a deposit to rent a public bike, while the A group gets to rent them for free for 90 minutes…

AI-powered directional microphones and smart cameras now dot the highways and streets of Shanghai. Drivers who honk excessively are automatically issued a ticket via Tencent’s WeChat, while their names, photographs, and national identity card numbers are displayed on nearby LED billboards. If a driver pulls over on the side of the road for more than seven minutes, they will trigger another instant traffic ticket. It isn’t just the ticket and the fine – points are deducted from the driver’s social credit score. When enough points are deducted, they will find it hard to book airline tickets or land a new job…’

Russell describes even more menacing developments:

‘Lethal Autonomous Weapons (what the United Nations calls AWS) already exist. The clearest example is Israel’s Harop, a loitering munition with a ten-foot wingspan and a fifty-pound warhead. It searches for up to six hours in a given geographical region for any target that meets a given criterion and then destroys it.

In 2016 the US Air Force demonstrated the in-flight deployment of 103 Perdix micro-drones from three F/A-18 fighters. Perdix are not pre-programmed, synchronized individuals; they are a collective organism, sharing one distributed brain for decision-making and adapting to each other like swarms in nature’ (cf. the drone attack in the action film Angel Has Fallen)…

In his book 21 Lessons for the 21st Century, Yuval Harari writes:

‘It is crucial to realize that the AI revolution is not just about computers getting faster and smarter. The better we understand the biochemical mechanisms that underpin human emotions, desires and choices, the better computers can become in analyzing human behavior, predicting human decisions, and replacing human drivers, bankers and lawyers…

It turns out that our choices of everything from food to mates result not from some mysterious free will but rather from billions of neurons calculating probabilities within a split second. Vaunted ‘human intuition’ is in reality pattern recognition…

This means that AI can outperform humans even in tasks that supposedly demand ‘intuition.’ In particular, AI can be better at jobs that demand intuitions about other people. Many lines of work – such as driving a vehicle in a street full of pedestrians, lending money to strangers, and negotiating a business deal – require the ability to correctly assess the emotions and desires of others. As long as it was thought that such emotions and desires were generated by an immaterial spirit, it seemed obvious that computers would never be able to replace human drivers, bankers and lawyers.

Yet if these emotions and desires are in fact no more than biochemical algorithms, there is no reason computers cannot decipher these algorithms – and do so far better than any Homo sapiens.’ (cf. Nick Bostrom’s Superintelligence)

Russell points out that we underestimate AIs at our peril:

‘Whereas a human can read and understand one book in a week, a machine could read and understand every book ever written – all 150 million of them – in a few hours. The machine can see everything at once through satellites, robots, and hundreds of millions of surveillance cameras; watch all the world’s TV broadcasts; and listen to all the world’s radio stations and phone conversations.
Very quickly it would gain a far more detailed and accurate understanding of the world and its inhabitants than any human could possibly hope to acquire…

In the cyber realm, machines already have access to billions of effectors – namely, the displays on all the phones and computers in the world. This partly explains the ability of IT companies to generate enormous wealth with very few employees; it also points to the severe vulnerability of the human race to manipulation via screens…’

In his book Cultural Evolution, Ronald Inglehart, lead researcher of the World Values Survey, observes that despite rhetoric from Trump and other xenophobic demagogues:

‘Foreigners are not the main threat. If developed societies excluded all foreigners and all imports, secure jobs would continue to disappear, since the leading cause – overwhelmingly – is automation. Once artificial intelligence starts learning independently, it moves at a pace that vastly outstrips human intelligence. Humanity needs to devise the means to stay in control of artificial intelligence. I suspect that unless we do so within the next twenty years or so, we will no longer have the option.’

So, our species’ remaining time may be limited – a momentous event predicted by the philosopher Nietzsche in Thus Spoke Zarathustra:

‘I teach you the Overman. Man is something that shall be overcome: what have you done to overcome him? All beings so far have created something beyond themselves. Do you want to be the ebb of this great flood? What is the ape to man? A laughingstock or a painful embarrassment. And man shall be just that for the Overman…

The Overman is the meaning of the Earth. Let your will say: the Overman shall be the meaning of the Earth…’

And if Artificial Intelligence were the Overman?
W**D
An important book, everyone should read it
Stuart Russell's new book, Human Compatible: Artificial Intelligence and the Problem of Control (HC2019), is great and everyone should read it. And I am proud that the ideas in my AGI-12 paper, Avoiding Unintended AI Behaviors (AGI2012), are very similar to ideas in HC2019. AGI2012 had its moment of glory, winning the Singularity Institute's (now called MIRI) Turing Prize for the Best AGI Safety Paper at AGI-12, but has since been largely forgotten. I see agreement with Stuart Russell as a form of vindication for my ideas. This article will explore the relation between HC2019 and AGI2012.

Chapters 7-10 of HC2019 "suggest a new way to think about AI and to ensure that machines remain beneficial to humans, forever." Chapter 7 opens with three principles for beneficial machines, which are elaborated over Chapters 7-10:

1. The machine's only objective is to maximize the realization of human preferences.
2. The machine is initially uncertain about what those preferences are.
3. The ultimate source of information about human preferences is human behavior.

AGI2012 defines an AI agent that is similar to Marcus Hutter's Universal AI (UAI2004). However, whereas the UAI2004 agent learns a model of its environment as a distribution of programs for a universal Turing machine, the AGI2012 agent learns a model of its environment as a single stochastic, finite-state program. The AGI2012 agent is finitely computable (assuming a finite time horizon for possible futures), although not practically computable. The ideas of AGI2012 correspond quite closely with the HC2019 principles:

1. The objective of the AGI2012 agent is to maximize human preferences as expressed by a sum of modeled utility values for each human (utility functions are a way to express preferences, as long as the set of preferences is complete and transitive). These modeled utility values are not static. Rather, the AGI2012 agent relearns its environment model and its models for human utility values periodically, perhaps at each time step.
2. The AGI2012 agent knows nothing about human preferences until it learns an environment model, so AGI2012 proposes a "two-stage agent architecture." The first-stage agent learns an environment model but does not act in the world. The second-stage agent, which acts in the world, takes over from the first-stage agent only after it has learned a model for the preferences of each human.
3. The AGI2012 agent learns its environment model, including its models for human preferences, from its interactions with its environment, which include its interactions with humans.

Subject to the length limits for AGI-12 papers, AGI2012 is terse. My on-line book, Ethical Artificial Intelligence (EAI2014), combines some of my papers into a (hopefully) coherent and expanded narrative. Chapter 7 of EAI2014 provides an expanded narrative for AGI2012.

On page 178, HC2019 says, "In principle, the machine can learn billions of different predictive preference models, one for each of the billions of people on Earth." The AGI2012 agent does this, in principle.

On pages 26, 173 and 237, HC2019 suggests that humans could watch movies of possible future lives and express their preferences. The AGI2012 agent connects models of current humans to interactive visualizations of possible futures (see Figure 7.4 in EAI2014) and asks the modeled humans to assign utility values to those futures (a weakness of AGI2012 is that it did not reference research on inverse reinforcement learning algorithms).
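To make the correspondence with principle 1 concrete, here is a minimal sketch of an agent objective that sums learned per-human utility models over candidate futures. The function names, the estimate() interface, and the idea of a forecast function are my own illustrative assumptions for this article, not code from HC2019 or AGI2012.

    def total_utility(candidate_future, preference_models):
        # Sum the utility each modeled human assigns to a candidate future.
        # `preference_models` maps each person to a learned model exposing
        # estimate(future) -> utility in [0.0, 1.0]; the models are assumed to be
        # relearned from observed behavior as new interactions arrive.
        return sum(model.estimate(candidate_future)
                   for model in preference_models.values())

    def choose_action(actions, forecast, preference_models):
        # Pick the action whose forecast future maximizes the summed modeled preferences.
        return max(actions, key=lambda a: total_utility(forecast(a), preference_models))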
As an author of Interactivity is the Key (VIS1989), I prefer interactive visualizations to movies.

As HC2019 and AGI2012 both acknowledge, there are difficult issues in expressing human preferences as utility values and in combining utility values for different humans. AGI2012 argues that constraining utility values to the fixed range [0.0, 1.0] provides a sort of normalization. Regarding the issues of the tyranny of the majority and evil human intentions, AGI2012 proposes applying a function with positive first derivative and negative second derivative to utility values, to give the AI agent greater total utility for actions that help more dissatisfied humans (justified in Section 7.5 of EAI2014 on the basis of Rawls's Theory of Justice; a small numerical sketch of this reweighting appears after the references below). This is a hack, but there seem to be no good theoretical answers for human utility values. HC2019 and AGI2012 both address the issue of the agent changing the size of the human population.

On page 201, HC2019 says, "Always allocate some probability, however small, to preferences that are logically possible." The AGI2012 agent does this using Bayesian logic.

On page 245, HC2019 warns against the temptation to use the power of AI to engineer the preferences of humans. I wholeheartedly agree, as reflected in my recent writings and talks. Given an AI agent that acts to create futures valued by (models of) current humans, it is an interesting question how current humans would value futures in which their values are changed.

On pages 254-256, HC2019 warns of possible futures in which humans are so reliant on AI that they become enfeebled. Again, it is an interesting question how current humans would value futures in which they must overcome challenges versus futures in which they face no challenges.

On page 252, HC2019 says, "Regulation of any kind is strenuously opposed in the [Silicon] Valley," and on page 249 it says that "three hundred separate efforts to develop ethical principles for AI" have been identified. I believe one goal of these AI ethics efforts is to substitute voluntary for mandatory standards. Humanity needs mandatory standards. Most importantly, humanity needs developers to be transparent about how their AI systems work and what they are used for.

(VIS1989) Hibbard, W., and Santek, D. 1989. Interactivity is the Key. Proc. Chapel Hill Workshop on Volume Visualization, pp. 39-43.
(AGI2012) Hibbard, B. 2012. Avoiding Unintended AI Behaviors. In: Bach, J., and Iklé, M. (eds) AGI 2012. LNCS (LNAI), vol. 7716, pp. 107-116. Springer.
(EAI2014) Hibbard, B. 2014. Ethical Artificial Intelligence. arXiv:1411.1373.
(UAI2004) Hutter, M. 2004. Universal Artificial Intelligence: Sequential Decisions Based On Algorithmic Probability. Springer.
(HC2019) Russell, S. 2019. Human Compatible: Artificial Intelligence and the Problem of Control. Viking.
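As promised above, a small numerical sketch of the concave reweighting. The square-root transform is just one possible function with a positive first derivative and a negative second derivative, and the utility numbers are made up purely for illustration.

    import math

    def reweighted_total(utilities, transform=math.sqrt):
        # Apply a concave, increasing transform to each human's utility before summing,
        # so a fixed utility gain counts for more when it goes to a dissatisfied human.
        return sum(transform(u) for u in utilities)

    # Raising the person at 0.1 by 0.2 helps the total more than raising the person at 0.8 by 0.2.
    print(reweighted_total([0.1 + 0.2, 0.8]))  # ~1.44
    print(reweighted_total([0.1, 0.8 + 0.2]))  # ~1.32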
F**C
As expected
Good