Since the very beginning of the computer revolution, researchers have dreamed of creating computers that would rival the human brain. Our brains are information machines that use inputs to generate outputs, and so are computers. How hard could it be to build computers that work as well as our brains?
In 1954 a Georgetown University-IBM team predicted that language translation programs would be perfected in three to five years. In 1965, cognitive psychologist Herbert Simon said that “machines will be capable, within twenty years, of doing any work a man can do.” In 1970, Marvin Minsky told Life magazine, “In from three to eight years we will have a machine with the general intelligence of an average human being.” Billions of dollars have been poured into efforts to build computers with artificial intelligence that equals or surpasses human intelligence. Researchers didn’t know it at first, but this was a moonshot – a wildly ambitious effort that had little chance of a quick payoff.
So far, it has failed. We still know very little about how the human brain works, but we have learned that building computers that rival human brains is not just a question of computational power and clever code.
AI research was launched at a summer conference at Dartmouth College in 1956 with the moonshot vision that “every aspect of learning or any other feature of intelligence can be so precisely described that a machine can be made to simulate it.” Seventeen years later, the 1973 Lighthill report commissioned by the UK Science Research Council concluded that “in no part of the field have the discoveries made so far produced the major impact that was then promised.” Funding dried up and an AI winter began. There was a resurgence of AI research in the 1980s, fueled by advances in computer memory and processing speed and the development of expert systems, followed by a second AI winter as the limitations of expert systems became apparent. Another resurgence began in the 1990s and continues to this day.
Widely publicised computer victories over world champions in backgammon, checkers, chess, Go and Jeopardy! have fueled the idea that the initial hopes for AI are on the verge of being realised. But just as in the first decades of moonshot hope, ambitious predictions and moving goalposts continue to be the norm.
In 2014, futurist Ray Kurzweil predicted that by 2029, computers will have human-level intelligence and will have all of the intellectual and emotional capabilities of humans, including “the ability to tell a joke, to be funny, to be romantic, to be loving, to be sexy.” As we move closer to 2029, Kurzweil talks more about 2045.
In a 2009 TED talk, Israeli neuroscientist Henry Markram said that within a decade his research group would reverse-engineer the human brain by using a supercomputer to simulate the brain’s 86 billion neurons and 100 trillion synapses.
These failed goals cost money. After being promised $1.3 billion in funding from the European Union, Markram’s Human Brain Project crashed in 2015. In 2016, the consulting firm PwC predicted that global GDP would be 14%, or $15.7 trillion, higher in 2030 because of AI products and services. They weren’t alone: McKinsey and Accenture issued similarly rosy forecasts for 2030, and in 2016 Forrester predicted a $1.2 trillion AI market by 2020. Four years later, in 2020, Forrester reported that the AI market was only $17 billion. It now projects the market to reach $37 billion by 2025. Oops!
The $15 trillion predictions made in 2016 assumed the success of AI moonshots such as Watson for health care, DeepMind and Nest for energy use, Level 5 self-driving vehicles on public roads, and humanlike robots and text. When moonshots like these work, they can be revolutionary; when they turn out to be pie in the sky, the failures are costly.
We have learned the hard way that winning a game of Go or Jeopardy! is a lot easier than processing words and images, providing effective healthcare and building self-driving cars. Computers are like New Zealander Nigel Richards, who memorised the 386,000 words in the French Scrabble dictionary and won the French-language Scrabble World Championship twice, even though he doesn’t know the meaning of the French words he spells. In the same way, computer algorithms fit mathematical equations to data that they do not understand and consequently cannot employ any of the critical thinking skills that humans have.
If a computer algorithm found a correlation between Donald Trump tweeting the word ‘with’ and the price of tea in China four days later, it would have no way of assessing whether that correlation was meaningful or meaningless. A state-of-the-art image recognition program was 99% certain that a series of horizontal black and yellow lines was a school bus, evidently focusing on the colour of the pixels and completely ignoring the fact that buses have wheels, doors and a windshield.
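To see how blind a curve-fitting algorithm is to meaning, consider a minimal sketch in Python (with entirely made-up numbers): two causally unrelated series that merely trend in the same direction produce a correlation coefficient near 1, and nothing in the statistic itself distinguishes this from a genuine link.

```python
# Sketch with hypothetical data: a correlation measure has no notion
# of meaning, only of co-movement.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical weekly counts of tweets containing 'with'...
tweets = [3, 5, 6, 8, 9, 11, 12, 14]
# ...and hypothetical tea prices four days later: unrelated, but also trending up.
tea_price = [2.1, 2.3, 2.2, 2.6, 2.7, 2.9, 3.0, 3.2]

print(round(pearson(tweets, tea_price), 2))  # → 0.99, despite no causal link
```

A human glances at the two series and asks why they should be related; the algorithm can only report that they move together.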
The healthcare moonshot has also disappointed. Swayed by IBM’s boasts about Watson, McKinsey predicted a 30-50% productivity improvement for nurses, a 5-9% reduction in health care costs, and health care savings in developed countries of up to 2% of GDP. The Wall Street Journal published a cautionary article in 2017, and soon others were questioning the hype. A 2019 article in IEEE Spectrum concluded that Watson had “overpromised and underdelivered.” Soon afterward, IBM pulled Watson from drug discovery, and media enthusiasm waned as bad news about AI health care accumulated. For example, in a 2020 Mayo Clinic and Harvard survey of clinical staff who were using AI-based clinical decision support to improve glycaemic control in patients with diabetes, respondents gave the program a median score of 11 on a scale of 0 to 100, and only 14% said they would recommend the system to other clinics.
Following Watson’s failure, the media moved on to Google health care articles in Nature and other journals that reported black-box results with unreported tweaks that were needed to make the models work well. After Google published its protein folding paper, an expert in structural biology said, “Until DeepMind shares their code, nobody in the field cares and it’s just them patting themselves on the back.” He also said that the idea that protein folding had been solved was “laughable.” An international group of scientists described a Google paper on breast cancer as another “very high-profile journal publishing a very exciting study that has nothing to do with science. … It’s more an advertisement for cool technology. We can’t really do anything with it.”
Such cautions are well deserved in light of the flop of Google’s highly touted Flu Trends algorithm. After claiming to be 97.5% accurate in predicting flu outbreaks, Google Flu Trends overestimated the number of flu cases for 100 of the next 108 weeks – by an average of nearly 100% – before being quietly retired.
The self-driving vehicle moonshot is in a similar state. By late 2018, it was becoming clear that self-driving cars were much harder to build than originally thought; one Wall Street Journal article was titled ‘Driverless Hype Collides With Merciless Reality’. In 2020, companies including Zoox, Ike, Kodiak Robotics, Lyft, Uber and Velodyne went through layoffs, bankruptcies, revaluations and sales at deflated prices. Uber sold its autonomous-driving unit in late 2020 after years of claiming that self-driving vehicles were its key to future profitability. An MIT task force announced in mid-2020 that fully driverless systems will take at least a decade to deploy over large areas.
Overall, AI moonshots are proving to be an expensive collection of failures. An October 2020 Wired article titled ‘Companies Are Rushing to Use AI—but Few See a Payoff’ reported that only 11% of firms that have deployed AI are reaping a “sizable” return on their investments. One reason is that costs often turn out to be higher – much higher – than originally assumed. According to a fall 2020 MIT Sloan Management Review article, “A good rule of thumb is that you should estimate that for every $1 you spend developing an algorithm, you must spend $100 to deploy and support it.”
The 2020 edition of the ‘State of AI Report’, published by AI investors Nathan Benaich and Ian Hogarth, concluded that “we’re rapidly approaching outrageous computational, economic, and environmental costs to gain incrementally smaller improvements in model performance.” For example, “Without major new research breakthroughs, dropping the [image recognition] error rate from 11.5% to 1% would require over one hundred billion billion dollars!”
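The shape of the report’s warning – each further point of accuracy costing vastly more than the last – can be illustrated with a toy calculation. Suppose, purely for illustration, that cost grows as a power of the inverse error rate; the exponent and error rates below are hypothetical, not figures taken from the report.

```python
# Toy illustration of diminishing returns (hypothetical constants,
# not the State of AI Report's actual cost model).

def cost_to_reach(error_pct, base_error=11.5, exponent=9):
    """Relative cost under an assumed power law: halving the error
    rate multiplies the cost by 2**exponent (here, 512x)."""
    return (base_error / error_pct) ** exponent

for e in [11.5, 5.0, 2.0, 1.0]:
    print(f"error {e:>4}% -> relative cost {cost_to_reach(e):,.0f}x")
```

Under these assumed numbers, reaching 1% error costs billions of times more than reaching 11.5% – the qualitative pattern, if not the exact figures, behind the report’s conclusion.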
The fact is that most moonshots fail: nuclear fusion, synthetic fuels, supersonic flight, maglev and blockchain for everything. Successful technologies, by contrast, generally begin in small and often overlooked applications and then expand to bigger and more important ones. Transistors were first used in hearing aids and radios before becoming ubiquitous in military equipment, computers and phones. Computers began with accounting applications and later expanded to every function of a company. LEDs were first used in calculators and automobile dashboards, long before being used for lighting. The internet began as a tool for professors before becoming the most widely used technology since electricity. Solar cells powered satellites and remote installations long before they generated electricity for urban homes and businesses. In almost every case, technologies begin in a niche and then incrementally expand to other applications over decades through exponential improvements in price and performance.
Some companies successfully focus their AI efforts on solutions to small problems with achievable benefits. For instance, DHL uses AI-controlled robots to find packages, move them around warehouses and load them onto planes. And Microsoft recently acquired Nuance, a company best known for a deep-learning voice transcription service that is very popular in the healthcare sector.
Many similar examples can be found in robotic process automation – software robots that emulate humans interacting with digital systems. It can be used for accounting, manufacturing, financial, and engineering transactions, and it is the fastest-growing segment of the AI market.
The same incremental approach can be used for health care, self-driving vehicles and more. Mutually beneficial diffusion and progress can come from collaboration among large research hospitals within and across countries as researchers learn from one another and generalize from one case to another. The holy grail of a robotaxi that can operate without a driver in every geographic location no matter the weather remains elusive, but self-driving vehicles are used successfully in constrained environments like mining camps, large factories, industrial parks, theme parks, golf clubs, and university campuses. It is surely better to perfect small solutions before moving on to crowded public roads with a plethora of unforeseen hazards.
One of the reasons AI overpromised and underdelivered is that we didn’t anticipate that building a computer that surpasses the human brain is the moonshot of all moonshots. Computers may someday rival human intelligence. In the meantime, we should recognise the limitations of AI and take advantage of the real strengths of computers. The failure of AI moonshots is not a reason to give up on AI, but it is a reason to be realistic about what AI can do for us.
This piece was originally published on Future Tense, a partnership between Slate magazine, Arizona State University and New America.