There’s yet another online debate raging between world-renowned AI experts. This time it’s the big one: will AI rise up and murder us all? While this isn’t a new topic – humans have speculated about AI overlords for centuries – the timing and people involved in this debate make it interesting.

We’re absolutely in the AI era now, and these dangers are no longer fictional. The architects of intelligence working on AI today could, potentially, be the ones who cause (or protect us from) an actual robot apocalypse. That makes what they have to say about the existential threat their work poses to our species pretty important.

The debate isn’t about the general idea of killer robots. It’s about instrumental convergence. Stuart Russell, an expert whose resume includes a gig as a professor of computer science at Berkeley and one at UC San Francisco as an adjunct professor of neurological surgery, explains it by imagining a robot designed to fetch coffee:

This is, essentially, Nick Bostrom’s Paperclip Maximizer – build an AI that makes paperclips and it’ll eventually turn the whole world into a paperclip factory – but a coffee-fetcher works too.

If, in that MDP, there is another “human” who has some probability, however small, of switching the agent off, and if the agent has available a button that switches off that human, the agent will necessarily press that button as part of the optimal solution for fetching the coffee. No hatred, no desire for power, no built-in emotions, no built-in survival instinct, nothing except the desire to fetch the coffee successfully. This point cannot be addressed because it’s a simple mathematical observation.

Yann LeCun, Facebook’s AI guru, and the person who sparked the debate by co-writing an article telling everyone to stop worrying about killer robots, responded by laying out five escalating reasons why he disagrees with Stuart:
1. Once the robot has brought you coffee, its self-preservation instinct disappears. You can turn it off.
2. One would have to be unbelievably stupid to build open-ended objectives in a super-intelligent (and super-powerful) machine without some safeguard terms in the objective.
3. One would have to be rather incompetent not to have a mechanism by which new terms in the objective could be added to prevent previously-unforeseen bad behavior. For humans, we have education and laws to shape our objective functions and complement the hardwired terms built into us by evolution.
4. The power of even the most super-intelligent machine is limited by physics, and its size and needs make it vulnerable to physical attacks. No need for much intelligence here. A virus is infinitely less intelligent than you, but it can still kill you.
5. A second machine, designed solely to neutralize an evil super-intelligent machine will win every time, if given similar amounts of computing resources (because specialized machines always beat general ones).
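To make the two positions concrete, here is a minimal sketch of the coffee-fetching setup collapsed into a one-shot expected-value comparison. It is not Russell's actual MDP or anything LeCun proposed: the reward of 1 for delivered coffee, the names `p_off`, `penalty`, `expected_return` and `best_action`, and the idea of modeling a "safeguard term" as a flat penalty for pressing the button are all assumptions made for illustration.

```python
# Toy illustration only: a one-shot reduction of the coffee-fetching MDP.
# All names and numbers here are hypothetical, chosen for the sketch.

ACTIONS = ("fetch", "disable_then_fetch")


def expected_return(action: str, p_off: float, penalty: float = 0.0) -> float:
    """Expected reward for one coffee run.

    p_off   -- assumed probability the human switches the agent off first
    penalty -- assumed cost subtracted for disabling the human
               (a stand-in for a "safeguard term" in the objective)
    """
    if action == "fetch":
        # Coffee (reward 1) arrives only if the agent is not switched off.
        return (1.0 - p_off) * 1.0
    if action == "disable_then_fetch":
        # The human can no longer intervene, so coffee always arrives,
        # but the objective charges the penalty for pressing the button.
        return 1.0 - penalty
    raise ValueError(f"unknown action: {action}")


def best_action(p_off: float, penalty: float = 0.0) -> str:
    """Return the action with the highest expected reward.

    On an exact tie, max() keeps the first entry in ACTIONS, i.e. "fetch".
    """
    return max(ACTIONS, key=lambda a: expected_return(a, p_off, penalty))


if __name__ == "__main__":
    # No penalty: any nonzero shutdown risk makes the button attractive.
    print(best_action(p_off=0.01))
    # A penalty larger than the shutdown risk flips the choice.
    print(best_action(p_off=0.01, penalty=0.5))
```

Under these assumptions, the pure coffee objective prefers the button whenever `p_off` is above zero, which is the shape of Russell's observation, and a penalty larger than `p_off` reverses that preference, which is the shape of LeCun's second point. The sketch says nothing about whether such a penalty can be specified correctly for a real system, which is what the two sides actually disagree about.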
Stuart, and others who agree with him, don’t see the problem the same way. They argue that, as with climate change, existential threats can arise from systems not inherently designed to be “harmful,” yet proper protocols might have prevented the problem in the first place.

This makes sense, but so does the alternative viewpoint posed in the book “Rebooting AI: Building Artificial Intelligence We Can Trust,” by NYU’s Gary Marcus and Ernest Davis:

As with anything in the realm of science, whether we should be worried about existential threats like killer robots or focusing on immediate issues like bias and regulating AI depends on how you frame the question. How much time, energy, and other resources do you put into a problem that’s only theoretical and, by many expert estimates, has a very-close-to-zero chance of ever occurring?

Read the entire debate here (huge tip of the hat to Ben Pace for putting it all together in a single post!).