The examined spy
New insights that could make national intelligence smarter
The American intelligence system is the world’s most sophisticated surveillance network, costing $80 billion a year with 200,000 operatives covering the mapped world, but its work can read like a grand comedy of errors. Notwithstanding all the things America’s spies get right, it’s hard to ignore the unending parade of major, world-changing events that they missed. There was the failure to connect the warnings before 9/11; the false guarantee that Saddam Hussein had weapons of mass destruction; the assured stability of Arab autocrats up until this January.
In the last decade, the network has only continued to grow, with a key role in informing decisions of war and peace, and the near impossible task of preventing another terrorist attack on American soil. With so much at stake, you would assume the intelligence community rigorously tests its methods, constantly honing and adjusting how it goes about the inherently imprecise task of predicting the future in a secretive, constantly shifting world.
You’d be wrong.
In a field still deeply shaped by arcane traditions and turf wars, when it comes to assessing what actually works — and which tidbits of information make it into the president’s daily brief — politics and power struggles among the 17 different American intelligence agencies are just as likely as security concerns to rule the day.
What if the intelligence community started to apply the emerging tools of social science to its work? What if it began testing and refining its predictions to determine which of its techniques yield useful information, and which should be discarded? Director of National Intelligence James R. Clapper, a retired Air Force general, has begun to invite this kind of thinking from the heart of the leviathan. He has asked outside experts to assess the intelligence community’s methods; at the same time, the government has begun directing some of its prodigious intelligence budget to academic research to explore pie-in-the-sky approaches to forecasting. All this effort is intended to transform America’s massive data-collection effort into much more accurate analysis and predictions.
“We still don’t really know what works and what doesn’t work,” said Baruch Fischhoff, a behavioral scientist at Carnegie Mellon University. “We say, put it to the test. The stakes are so high, how can you afford not to structure yourself for learning?”
This idea appears to be gaining traction among the nation’s intelligence leaders. Fischhoff chaired a panel of behavioral scientists who issued a 102-page report in March, commissioned by Clapper, about specific ways their field could help America’s top spies do a better job. Multidisciplinary teams of social scientists have already started working with the CIA and other American intelligence agencies to design more accurate ways of predicting what might happen in unstable or impenetrable places like the Korean demilitarized zone or the hinterlands of Central Asia.
Some of their new suggestions are simple, such as urging authors of intelligence reports to adopt precise and uniform language when they make predictions. Some of them are intriguing uses of tools that have already caught on in the worlds of business and science, such as crowdsourcing for more accurate analysis and collaborating in ways that minimize errors and improve accuracy.
Their work has only just begun, and not all their findings will survive the culture clash already breaking out between the academics and their new “clients” in America’s spy bureaucracy. But if it works, it will represent a long overdue shift: an acknowledgement that spying in the information age is not some black art, purely the product of intuition and creative intellect. It’s a systematic discipline, which could benefit from a scientific retooling. Recent misses remind us how much depends on getting it right.
In history and its own mythology, spying is a kind of rogue trade, one built on independent thinking, boundless physical courage, and a bent for imaginative improvisation. Today, intelligence has become far more technical and deeply bureaucratic. The heart of contemporary spying isn’t secret agents riding across the steppes of Central Asia: It’s surveillance satellites and supercomputers sifting through e-mails, phone calls, and other signals. Spying originated from the need to fill in absent information, with intuition playing a key role; today’s central problem often comes down to too much information, and no one who can understand it.
Other fields that depend on accurate data analysis have in recent decades made a slow but effective shift away from intuition to evidence-based professional cultures. In medicine, it wasn’t that long ago that doctors insisted their gut instincts should control the diagnostic process. But as data have come in showing that diagnostic algorithms could be more accurate than clinical instincts alone, and that some treatments didn’t work as doctors thought, the profession has grudgingly embraced a change in thinking.
Fischhoff and a who’s who of social scientists from psychology, business, and policy departments hope to foment a similar revolution in the intelligence world. Their most radical suggestion could have far-reaching effects and is already being slowly implemented: systematically judge the success rates of analyst predictions, and figure out which approaches actually work. Is intuition more useful than computer modeling? Is game theory better for some situations, and on-the-ground social analysis more accurate elsewhere?
Fischhoff envisions intelligence agencies, in real time, assigning teams with very different approaches to separately analyze real world situations, like the current state of play in Syria and the wider Arab world. Over the course of the next couple of years, researchers would track the success of different approaches to see which methods work best.
That remains only a proposal so far, but the Intelligence Advanced Research Projects Activity, or IARPA — a two-year-old agency that funds experimental ideas — is already trying a novel way to generate imaginative new steps to make predictions better. It is funding an unusual contest among academic researchers, a forecasting competition that will pit five teams using different methods of prediction against one another. If they come up with new methods that work better than the old, intelligence analysts could adopt them.
Other approaches are less sweeping, and take a more granular look at the problems that arise when intelligence analysts talk to one another. Linguists and psychologists have homed in on the vagueness of the language intelligence analysts use. Terms such as “likely” and “possible” are common in intelligence reports, but mean different things to different people, and even different things to the same people in different contexts. National Intelligence Estimates already have a key that assigns specific probability ranges to commonly used terms such as “likely” and “probably,” but the social scientists who designed the system say there’s little evidence that analysts actually follow it. Instead, they would like intelligence analysts to phrase their claims in quantifiable, verifiable language — for example, there’s an 80 percent chance that Kim Jong Il is succeeded by his son, or there’s a 50 percent chance that Bashar al-Assad’s regime collapses within the next 12 months.
Furthermore, they want analysts to grade how confident they are in their predictions and their source material. The scientists say assigning confidence levels would force analysts to acknowledge in a systematic way that some predictions and insights are more solid than others. They also say that numerical language will reduce misunderstandings between spies and the policy makers who use their reports — and, over time, it will encourage individual analysts to improve their own records.
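To see how numerical predictions could be graded over time, here is a minimal sketch using the Brier score, a standard measure that averages the squared gap between each stated probability and what actually happened (1 if the event occurred, 0 if not). The article does not say which scoring rule the researchers have in mind, so the choice of the Brier score is an assumption, and the two analysts and their track records below are invented for illustration.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between stated probabilities and 0/1 outcomes.

    0.0 is a perfect record; lower is better.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty forecasts and outcomes")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical data: four yes/no questions and how they turned out.
outcomes = [1, 1, 0, 1]

# Hypothetical analysts: one commits to firm numbers, one always says "50 percent".
committed_analyst = [0.8, 0.9, 0.2, 0.7]
fence_sitter = [0.5, 0.5, 0.5, 0.5]

committed_score = brier_score(committed_analyst, outcomes)
fence_sitter_score = brier_score(fence_sitter, outcomes)

print(f"committed analyst: {committed_score:.3f}")
print(f"fence sitter:      {fence_sitter_score:.3f}")
```

In this made-up record the analyst who commits to firm numbers and turns out to be mostly right scores far better than the one who hedges at 50 percent every time. A vague word like “likely” cannot be scored this way at all, which is the practical case for quantifiable language.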
One final recommendation builds on an idea that has already paid off in numerous realms: crowdsourcing. Social scientists have found that combining many people’s predictions — even if they are not experts — usually yields better results than any single person’s judgment. With 200,000 people in its direct employ, and nearly 1 million outsiders holding security clearances, the US intelligence network would seem to be a perfect place to take advantage of the wisdom of the herd — except that so many of these people work in compartmentalized and secretive units. To solve this problem, IARPA is awarding grants to teams of social scientists to craft the best approaches to pooling multiple sources of intelligence analysis, generating more accurate predictions than individual departments might manage on their own.
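The simplest form of pooling is a plain average of everyone’s probability estimate. The sketch below is only an illustration of that idea, not a description of the methods the IARPA-funded teams are developing; the five estimates are hypothetical, and squared error is used as the yardstick by assumption.

```python
from statistics import mean

# Hypothetical data: five analysts estimate the chance of an event that did happen.
estimates = [0.9, 0.4, 0.7, 0.55, 0.8]
outcome = 1  # 1 = the event occurred

# Pool the crowd by taking the simple average of the individual estimates.
crowd_estimate = mean(estimates)

# Squared error of the pooled forecast versus the average squared error
# of the individual forecasts.
crowd_error = (crowd_estimate - outcome) ** 2
avg_individual_error = mean((p - outcome) ** 2 for p in estimates)

print(f"crowd estimate:           {crowd_estimate:.2f}")
print(f"crowd squared error:      {crowd_error:.4f}")
print(f"avg. individual error:    {avg_individual_error:.4f}")
```

By this measure the averaged forecast can never do worse than the typical individual, because the misses of overconfident and underconfident analysts partly cancel. It does not guarantee beating the best single analyst on any given question, which is one reason researchers are testing more refined pooling methods.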
There’s been pushback to these suggestions from the spy fraternity. Intelligence methods draw on deep traditions that aren’t easily unseated, and plenty of data collection is already organized by serious scientists who aren’t necessarily friendly to new ideas from psychology and linguistics.
Some of the objections are more pragmatic: Presenting human judgments as numerical probabilities, say critics within the intelligence community, amounts to false precision, turning a naturally fuzzy discipline into a misleading pseudoscience. And the benefits of a slow statistical improvement could be wiped out by one high-profile failure: Who would be pleased if analysts scored better on a raft of political and security predictions, but failed to stop a terrorist plot on American soil?
Robert L. Hutchings, who chaired the National Intelligence Council from 2003 to 2005 and is now dean of the Lyndon B. Johnson School of Public Affairs at the University of Texas, worries that social science is simply the wrong tool for the job.
“The claims social scientists make for their work are often extravagant,” he said. “They are accustomed to examining and explaining past events; when it comes to anticipating future developments, their track record is no better — and is usually much worse — than that of government insiders, who have much keener awareness of the contingent aspects of history.”
Social scientists, however, argue that even basic practices normal in science would help a lot, even if they’re not a magic bullet; many of the social scientists invited to study the field by the director of national intelligence were stunned to learn how much of the intelligence community’s resources went into collecting information, and how little went into analyzing it. Just changing that ratio could go a long way.
“It’s not as if we’ve been driving blind, but we could do a much better job,” said Philip Tetlock, an expert on errors in political judgment at the University of Pennsylvania, who is also leading one of the forecasting groups.
As for how hidebound agencies like the CIA and NSA will turn out to be, there’s a seminal generational change afoot in the intelligence world. About two-thirds of the analysts at America’s 17 official intelligence agencies have begun their careers in the decade since 9/11 — a pivot point that provoked profound self-criticism among the spies who had information about the impending attacks but failed to connect the dots. “The intelligence community is more receptive than they’ve ever been to the developments in the behavioral sciences,” said Hal Arkes, a judgment and decision-making researcher at Ohio State University.
No one knows how much better intelligence analysts could do; that’s another question that has yet to be studied. “That’s the $64 million question: How much improvement can we expect?” Tetlock said. “It’s a question of how close you think we are to the optimal forecasting frontier.”
Thanassis Cambanis is the author of “A Privilege to Die: Inside Hezbollah’s Legions and Their Endless War Against Israel” and blogs at thanassiscambanis.com. He is an Ideas columnist.