Machine Made History

The Declassification Engine Attempts Automatic Event Detection

Digital media create a torrent of data, and researchers have long sought to identify patterns that point to important events -- patterns that might even be predictable. Just think of how unusual “chatter” in a terrorist network can raise the threat level, or how companies measure what’s trending when making investment decisions. Social scientists have developed systems that attempt to automatically detect events in media reporting, whether political protests or inter-state conflict. But it turns out to be quite difficult, even when we know exactly what we are looking for. Recently researchers found zero overlap between an automated event detection system and events identified by humans -- not one of the events detected by the algorithm showed up on the list drawn up by people who reviewed news reporting, and vice versa. Moreover, these systems do not directly measure political activity, but rather what reporters happen to write about.

A deeper problem is that, even when we get to the second or third draft of history, experts disagree about what matters. Scholars have long debated the relative importance of dramatic events versus longer-term trends. Does a failed assassination or successful coup really send the world in a different direction, or do deeper, structural factors make some outcomes all but inevitable?

But the inherent complexity of assessing the importance of events -- or even defining what we mean by an event -- makes the prospect of a radically inductive method all the more compelling. If we could develop some means not only to automatically identify events, but to measure their relative significance, we might be able to test claims about which events really made a difference. We might also be able to determine whether and how these events are correlated with other quantifiable phenomena, including media reporting. Comparing the two might, for instance, show when policymakers are making news and when they are simply reacting to it.

But perhaps the best reason to attempt automatic event detection is simply that we do not always know exactly what we are looking for. Think of how much data future historians will have to deal with, like the approximately one billion emails the State Department is now generating every year. Decades from now, scholars might think they already know about the key events and trends, probably based on what they dimly remember, or have read in the media or memoirs. Even if they are right, how will they find the relevant records, if all they have is a few keywords and a search engine? Would it not be better to have some way to detect patterns and anomalies that reveal -- and rank order -- events that might not have gotten the attention they deserve?

To see what is possible with current technology, we set out to automatically identify events in a collection of 1.7 million U.S. diplomatic cables from the years 1973-1977 -- the years when the State Department first started to use electronic records systems. In this period, the cable was the main form of classified communication between Washington and hundreds of U.S. embassies and consulates around the world. “Traffic analysis” is a well-developed field of research, and we began experimenting with different models to discover what statisticians call “changepoints” in the communication streams.

It turns out that the State Department itself has long been using “Traffic Analysis by Geography and Subject” to track its own communications. Each cable is assigned one or more of these “TAGS,” whether for the country it concerns or for a range of subjects -- from refugees to embassy evacuations to the UN General Assembly. And since we are only using metadata -- basically just dates and TAGS -- we can include 330,000 cables where the text is still secret.

There were still many challenges in creating this system. Early on, we decided to focus on the dates that witnessed increased traffic about this or that subject. These are the times and places when history started to accelerate. But communications streams can accelerate or decelerate from day to day for reasons that have nothing to do with historical events. There are weekends and holidays, most obviously, which see less than a tenth as many cables as a typical weekday. There are also countless spikes that might represent nothing more interesting than a foreign service officer clearing out the inbox before leaving on vacation. If we graph the distribution, it’s clear how Fridays -- the “5s” along the top -- tend to be the busiest day, and there’s little happening on Saturdays and Sundays -- here numbered as 6 and 0.
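The weekday pattern described above is easy to tabulate from dates alone. The following is a minimal sketch, assuming the cable dates are available as Python `date` objects; the function name `cables_per_weekday` and the sample dates are made up for illustration. It uses the same numbering as the graph, with 0 for Sunday through 6 for Saturday.

```python
from collections import Counter
from datetime import date

def cables_per_weekday(dates):
    """Count cables by day of week, numbered 0 = Sunday through 6 = Saturday."""
    # date.weekday() numbers Monday as 0, so shift by one to put Sunday at 0.
    counts = Counter((d.weekday() + 1) % 7 for d in dates)
    return [counts.get(day, 0) for day in range(7)]

# Toy data (made up for illustration): a few dates from one week in October 1973.
sample_dates = [
    date(1973, 10, 15),  # Monday
    date(1973, 10, 16),  # Tuesday
    date(1973, 10, 19),  # Friday
    date(1973, 10, 19),  # Friday
    date(1973, 10, 19),  # Friday
    date(1973, 10, 20),  # Saturday
]
weekday_counts = cables_per_weekday(sample_dates)
print(weekday_counts)
```

A table like this makes the problem concrete: a raw count that doubles from Sunday to Monday says nothing about history, which is why any burst detector has to work with something steadier than daily totals.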

A team of three Columbia statisticians, graduate students Yuanjun Gao and Jonathan Goetz and Assistant Professor Rahul Mazumder, therefore had to develop robust statistical models that could identify the true underlying signals by adjusting for various forms of variability and noise in the data. An important component of the approach involved fitting localized probability models to the proportion of cables in which a particular TAG appeared, where the models were allowed to have a certain number of “change points” or “structural breaks” as estimated from the data. The statistical algorithms can automatically identify periods in which there was a sustained “burst” of activity above the normal baseline, which corresponds to no unusual activity. Along with point estimates of the most likely dates, the algorithm also provides a measure of uncertainty as to when these bursts begin and end. The method also returns an estimate of the peak date -- when there was the largest number of cables relative to average activity for a particular country or subject. The statistical procedure is flexible: it can estimate bursts of various intensities, depending upon the granularity at which the historian wishes to visualize and interpret the results.
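To give a feel for what "a sustained burst above the baseline" means in practice, here is a deliberately crude sketch. It is not the team's change-point model: it swaps in a simple threshold rule, and the function name `find_bursts`, the `ratio` and `min_days` parameters, and the toy series are all assumptions made for illustration. Like the approach described above, it works with the proportion of each day's cables that carry a TAG rather than raw counts, which partly cancels out weekend and holiday lulls.

```python
def find_bursts(tag_counts, total_counts, ratio=1.5, min_days=3):
    """Return (start, peak, end) index triples for sustained runs where the
    daily share of cables carrying a TAG exceeds `ratio` times its overall share.

    A crude threshold rule standing in for a fitted change-point model.
    """
    overall = sum(tag_counts) / sum(total_counts)
    # Daily share of traffic; days with no cables at all count as zero share.
    shares = [t / n if n else 0.0 for t, n in zip(tag_counts, total_counts)]
    bursts, start = [], None
    for i, share in enumerate(shares + [0.0]):  # sentinel closes a trailing run
        if share > ratio * overall:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_days:
                peak = max(range(start, i), key=lambda j: shares[j])
                bursts.append((start, peak, i - 1))
            start = None
    return bursts

# Toy series (made up): ten days of traffic with a flare-up on days 4 to 7.
tag_counts = [1, 1, 0, 1, 8, 12, 9, 7, 1, 1]
total_counts = [50, 50, 50, 50, 50, 50, 50, 50, 50, 50]
bursts = find_bursts(tag_counts, total_counts)
print(bursts)
```

Unlike the model described above, this rule gives no measure of uncertainty about when a burst begins or ends, and its results depend heavily on the hand-picked threshold.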

A simple version of the method seeks to estimate parameters that optimize the following regularized likelihood criterion:

Regularized formula for the likelihood criterion

where the first term corresponds to the probability model that explains the data

and the parameters are

the probability models are encouraged to be localized via the second term, which is known in statistical parlance as a regularizer.
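For readers who want a concrete picture, one generic criterion of this kind is a binomial log-likelihood with a total-variation penalty. This is an illustrative assumption, not necessarily the exact model the team fit, and all of the notation is introduced here: y_t is the number of cables carrying a given TAG on day t, n_t is the total number of cables that day, p_t is the underlying proportion, and λ is a tuning parameter.

```latex
\hat{p} \;=\; \arg\max_{p_1,\dots,p_T \in (0,1)}
  \sum_{t=1}^{T} \Bigl[\, y_t \log p_t + (n_t - y_t)\log(1 - p_t) \Bigr]
  \;-\; \lambda \sum_{t=2}^{T} \bigl\lvert p_t - p_{t-1} \bigr\rvert
```

In a criterion of this shape, the first sum rewards proportions that explain the observed counts, while the penalty discourages changes between consecutive days. The fitted p_t therefore tends to be piecewise constant with a small number of change points, and a larger λ yields fewer, coarser bursts.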

Once they had developed this model, Rahul, Yuanjun, and Jack used Columbia’s High Performance Computing Cluster to compute the results. With some 2,600 cores, it took less than an hour to process 1.7 million cables and produce a rank-ordered list of almost five hundred “bursts” from 1973 to 1977. Here are the top twelve, along with the events they coincide with -- to the extent it is possible to identify a specific event.

Event: ?
Start: July 1, 1973
Peak: June 28, 1974
End: January 3, 1975
State Department TAGS: Consular Affairs-Visas
Weight: 5176

Event: ?
Start: July 1, 1973
Peak: September 28, 1973
End: August 9, 1974
State Department TAGS: Economic Affairs-Transportation
Weight: 5078

Event: Carter Prioritizes Human Rights
Start: January 19, 1977
Peak: November 18, 1977
End: December 31, 1977
State Department TAGS: Social Affairs-Human Rights
Weight: 2951

Event: Sadat Visits Israel
Start: June 8, 1977
Peak: November 18, 1977
End: December 31, 1977
State Department TAGS: Political Affairs-Government
Weight: 2634

Event: The Primacy of Domestic Politics?
Start: February 12, 1977
Peak: May 20, 1977
End: June 25, 1977
State Department TAGS: United States
Weight: 2126

Event: Fall of South Vietnam
Start: January 4, 1975
Peak: April 25, 1975
End: October 19, 1975
State Department TAGS: Vietnam (South)
Weight: 2080

Event: Southeast Asian "Boat People" Crisis
Start: June 21, 1977
Peak: October 12, 1977
End: December 31, 1977
State Department TAGS: Vietnam
Weight: 2068

Event: Conclusion of the Panama Canal Treaty
Start: June 9, 1977
Peak: September 2, 1977
End: December 31, 1977
State Department TAGS: Diplomatic and Consular Representation
Weight: 2007

Event: Yom Kippur War
Start: October 7, 1973
Peak: October 16, 1973
End: March 18, 1974
State Department TAGS: Middle East
Weight: 1969

Event: Greek-Turkish Conflict over Cyprus Coup
Start: July 20, 1974
Peak: July 20, 1974
End: August 25, 1974
State Department TAGS: Cyprus
Weight: 1868

Event: Portugal withdraws from Angola
Start: November 8, 1975
Peak: November 10, 1975
End: February 21, 1976
State Department TAGS: Angola
Weight: 1696

Event: South Africa Arms embargo, ILO pullout
Start: June 15, 1977
Peak: November 11, 1977
End: December 31, 1977
State Department TAGS: Relations with International Organizations
Weight: 1651

Like the supercomputer in The Hitchhiker’s Guide to the Galaxy, which calculated that the answer to “The Ultimate Question of Life, the Universe, and Everything” was precisely 42, some of these results are hard to interpret. Put simply, the ranking by “weight” reflects what proportion of all cables about this or that country or subject was sent during a period with sustained activity above the baseline. But this does not always reflect some dramatic event, especially since some of these “bursts” -- like the first one, for CVIS -- begin when the dataset begins and go on for more than a year. The model has a hard time distinguishing this kind of activity, which starts at a high level and diminishes, like visa applications, from the rise and fall of real events. Similarly, the second burst is made up of TAGS, “ETRN,” that were commonly used, and overused, from when we begin to have records right up until 1974. To the model, both of these look like bursts, but they simply reflect administrative procedures, like a proclivity to use particular TAGS until they are overused, rather than real history.
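The share-based idea behind the “weight” can be sketched in a few lines. This is a simplified illustration, not the model's actual scoring: how the proportion is scaled into the weights listed above is not spelled out, so the hypothetical function `burst_share` simply returns a fraction, and the dates are made up.

```python
from datetime import date

def burst_share(cable_dates, start, end):
    """Fraction of a TAG's cables dated within [start, end], inclusive."""
    if not cable_dates:
        return 0.0
    inside = sum(1 for d in cable_dates if start <= d <= end)
    return inside / len(cable_dates)

# Toy data (made up): dates of five cables carrying one hypothetical TAG.
cable_dates = [
    date(1974, 7, 1),
    date(1974, 7, 20),
    date(1974, 7, 21),
    date(1974, 8, 10),
    date(1974, 12, 5),
]
share = burst_share(cable_dates, date(1974, 7, 20), date(1974, 8, 25))
print(share)
```

A score of this kind also shows why long windows that open on the first day of the dataset rank so highly: the longer the window, the larger the share of a TAG's cables it captures, whether or not anything dramatic happened.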

So here is a corrected list that drops these two bursts.

What Statistical Modeling and High Performance Computing Rank as The Top Ten Events in U.S. Foreign Policy, 1973-1977

1. Human Rights (Briefly) Becomes a Priority for US Foreign Policy

President Jimmy Carter meeting with Shah Reza Pahlavi of Iran, black & white

This burst begins the day Jimmy Carter arrived in Washington to become President. He famously promised to make respect for human rights a priority for his administration. Ironically, the burst peaks in November 1977, the same week Carter hosted the Shah of Iran at the White House amidst tear gas and tumultuous demonstrations. Weeks later, in Tehran, he (infamously) praised Iran under the Shah as “an island of stability.”

2. Sadat Goes To Jerusalem

The peak of the second burst is made up of dozens of cables about Anwar Sadat’s decision to go to Israel, the first such visit by an Arab head of state.

3. The Continuing Primacy of Internal Politics?

The next is harder to interpret, since all of the cables in some sense concern the U.S. More than 400,000 have “US” TAGS, so any increase in relative frequency, even slight, is enough to overwhelm the model. Historians have long debated to what extent domestic politics drives foreign policy, and statistical analysis shows how much diplomacy consists of communications about what’s happening back home.

4. America Loses a War (April 1975)

But the fourth burst is unmistakably about the fall of South Vietnam. It starts two days before a disastrous defeat in January 1975, when the North Vietnamese took control of Phuoc Long province, and peaks when the U.S. was preparing a massive helicopter evacuation of Americans from Saigon.

5. The People of Vietnam Suffer Much More

The fall of South Vietnam led to a refugee crisis that continued for years afterward, the fifth burst, as hundreds of thousands of people took to the seas to flee Communist rule.

6. The US Gives Up Control of the Panama Canal (September 1977)

Political cartoon depicting Jimmy Carter's decision to give up control of the Panama Canal

Though largely forgotten now, the decision to give up control of the Panama Canal was one of the most controversial decisions of the Carter presidency. The peak of the sixth burst consists of dozens of cables concerning which countries would be represented at the signing ceremony five days later.

7. The Yom Kippur War (October 1973)

The start of the seventh burst, the Yom Kippur War, came with almost no warning, and the peak occurred when the U.S. began to airlift weapons to the beleaguered Israelis.

8. Turkey invades Cyprus

The eighth burst is even more dramatic, as it peaks the same day it starts, when Turkey invades Cyprus to stop a takeover by the Greek military. This was a nightmare for Washington, since both countries were NATO allies. Cyprus remains divided to this day.

9. Portugal Pulls out of Angola, the End of European Empires

Portuguese soldiers in the Portuguese Colonial War in Angola

The ninth burst marks the end of Portuguese rule in Angola amidst civil conflict in both countries. This was Lisbon’s first and last colony in Africa, the effective end of European empires.

10. The U.N. Security Council Imposes an Arms Embargo against South Africa, and...

A photo of the United Nations Security Council

The tenth and last burst came about because of a series of overlapping events involving the United Nations, including the U.S. withdrawal from the International Labor Organization and a unanimous Security Council vote imposing an arms embargo on South Africa. For now, there is no easy way to disaggregate cables that concern different events.

To evaluate our results, we reached out to Daniel Sargent, a Berkeley historian who has just written an acclaimed book about U.S. foreign policy in the 1970s. Before showing him our work, we asked Daniel to create his own rank-ordered list of the most important events of this period.

Man-Made History

Daniel Sargent

Ranking isn’t something I’ve had much occasion to do as a historian. Several years back, I participated in a poll to rank the secretaries of state. But ranking historical events by significance was a new kind of challenge. The first problem was the criteria: how to define significance? This could be assessed in terms of contemporary impact: the column inches and diplomatic consternation that events generated in their own times. I thought this approach insufficient. Unlike the perspective of the immediate moment, hindsight brings the opportunity to reevaluate the significance of particular events in relation to broader historical processes, longer chains of causation, and bigger frameworks of meaning. As I proceeded, then, I did not try to shed the insight that hindsight has to offer. Rather, I set out, quite self-consciously, to rank the events in the mid-1970s in terms of their significance for subsequent developments.

I began with my own recently published history of US foreign relations in the 1970s. Skimming the chapters on the mid-1970s, I generated a list of events that I thought merited inclusion. Next, I went to the digital database of primary sources on which this book was based. Comprising about 8,700 documents, this database is text-searchable and, crucially for this exercise, sortable by various criteria, including date. To supplement my initial list, I sorted my entire document collection by date and skimmed the list, making note of events that my initial, book-based list had overlooked. I should at this point explain the origins of the database. When I embarked upon the research project that became my book, I did so not with a specific topic but with broad questions: how were world politics changing in the 1970s; how did US decision makers perceive change in their international environment; how did they respond to it? To engage these questions, I read through a variety of long archival series. These included transcripts of Henry Kissinger’s telephone conversations and meetings; series of documents that crossed presidential desks; and the weekly national security reports that Zbigniew Brzezinski provided to President Carter from 1977 to 1981. There were, to be sure, gaps and holes in these long file series, often where documents had been redacted and withdrawn. Perusing these long series nonetheless gave me the opportunity to assess the significance of historical events through the eyes of decision makers, not those of subsequent historians. This research and the book that I based upon it were the basis for my initial list, which comprised some 49 events between January 1, 1973 and December 31, 1977.

Once I had my list, I had to rank it in order of significance. This was perhaps the trickiest part of the exercise, for here there were few objective criteria to guide my choices. My choices reflect my view that the 1970s, the mid-1970s in particular, saw a range of novel concerns intrude upon the making of US foreign policy, marginalizing, at least to some extent, established Cold War priorities. These new concerns included the rise of economic interdependence, which we’ve since come to call globalization; the energy crises that emanated from the Middle East and challenged the prosperity of industrial societies; and the idea of human rights, which wasn’t new in the 1970s but which activists and advocates batted around with renewed vigor. Others may disagree, but these are my interpretative biases. With that caveat, I’ll proceed to explain my choices:

1. October War in the Middle East (October 1973)

The major turning point of the 1970s. Having presumed the Third Arab-Israeli War to be a Cold War proxy conflict, American decision makers found themselves confounded in October 1973, confronting a united Arab front and an oil embargo. The war nonetheless transformed the Middle East’s geopolitics with beneficial consequences for the United States. Egypt switched sides to become a US client, and Kissinger managed to marginalize Soviet influence in the region while repairing relations with the Arab world.

2. G-77 unveils demand for NIEO (April 1974)

A relatively obscure choice, the proclamation of the New International Economic Order initiated an ideological clash between the West and an insurgent Third World. Eager to emulate OPEC, the developing countries demanded the creation of an international system of cartels to bolster the prices of commodities, a world food reserve system, and wealth redistribution on a global scale. While the Third World’s demands for a new international order have faded since the 1970s, the North-South conflict defined global geopolitics for much of the decade.

3. Rambouillet summit of the G-5 (November 1975)

While the industrialized countries met, initially, as a Group of Five, Rambouillet initiated the G-7 summits that continue to the present day. These meetings provide a forum for dialogue, not collaborative governance, but the mere fact of the summits acknowledges that foreign policy, in an era of complex interdependence, is not just a matter of diplomacy. Monetary policy, fiscal policy, energy policy and other areas heretofore considered “domestic” have since the mid-1970s been on the foreign-policy agenda.

4. OPEC decides to implement major new hike in oil prices (October 1973)

The oil price hikes of 1973/74 are commonly misunderstood to have been a consequence of the October War. In fact, the remaking of the oil price regime began several years before the war and continued after it. The underlying cause was America’s loss of primacy as “swing” producer to the world market and the assumption of that role -- and with it the power to set prices -- by the oil producers of the Middle East. The consequences would include the worst recession of the postwar era, which incentivized the rise of energy-efficient industries in the West, and a long energy bonanza in the oil-exporting USSR, a bonanza that lasted until the mid-1980s…

5. London Summit of G-7 (May 1977)

Listed here not for what it achieved—but for what it failed to achieve. The Carter administration sought, at first, to redress the international economic difficulties of the 1970s through the orchestration of a traditional Keynesian stimulus on an international scale. This approach debuted at the London Summit of the G-7 in 1977 and culminated in the Bonn Summit in 1978. Ultimately, it failed, and the United States and its allies ended up adopting monetarist solutions, not reflating their way to prosperity, as the Keynesians had hoped.

6. Final Act of the CSCE (August 1975)

Reviled in its own time, the Final Act of the Conference on Security and Cooperation in Europe (CSCE) culminated the years-long CSCE process and yielded the first pan-European security treaty of the postwar era. Also known as the Helsinki Final Act, the agreement confirmed the stabilization of Cold War rivalries in an era of détente, but it also helped to introduce the language of human rights to the East Bloc, with consequences that would prove transformative.

7. China – downfall of the Gang of Four (Fall 1976)

The political developments in China that followed upon the September 1976 death of Mao Zedong barely registered with Americans at the time. Yet the struggles to define China’s post-Mao future that antedated and followed Mao’s death would have vast consequences for China and the world. The moderates of course defeated the radicals, arresting the Gang of Four that included Mao’s widow Jiang Qing in October 1976. The Communist Party made a decisive choice for moderation and reform two years later (at the famous Third Plenum of the 11th Central Committee), but intense fear of the revolutionary forces that Jiang represented continues to haunt China’s leaders to the present.

8. Carter-Torrijos Treaties on Panama Canal (September 1977)

Jimmy Carter’s attempt to rectify the sins of an imperial past. Believing that the United States had “cheated the Panamanians out of their canal” in the first place, Carter undertook to repair a long-running sore in US-Latin American relations. Carter’s adept diplomacy and his masterful management of the Canal Treaties’ progress through Congress stand as a forceful rebuke to critics who challenge his record on foreign policy. Without Carter’s diplomatic resolution, Panama might well have attempted to take the Canal Zone by force, which would have precipitated a serious crisis in US relations with Latin America.

9. US presidential election (November 1976)

Elections matter for foreign policy, some more than others. While Ford and Carter agreed on much, the election manifested underlying divergences of purpose. Ford aimed to sustain the international stability that he, Richard Nixon, and Henry Kissinger had built. Carter envisaged a new leadership role for a globalizing world, including proactive promotion of human rights, a signal commitment for the new administration.

10. Carter initiates PRM on human rights in US foreign policy (May 1977)

For the first year, Carter’s human rights policy lacked clear and programmatic guidance. What are human rights? What should the United States do when foreign countries violate them? To answer these (and other) questions, Carter in 1977 initiated the first presidential study of human rights in foreign policy: NSC/Presidential Review Memorandum 28. The exercise generated a capacious policy agenda: the United States would embrace a broad definition of human rights, including social and economic as well as civil and political rights. Achieving this in practice proved more difficult, but the exercise nonetheless confirmed the arrival of human rights promotion as a US foreign policy objective.

Comparing and contrasting my top ten with the results that the History Lab generated, I feel a certain relief. For all the differences, which are substantive, our conclusions are not so far removed as I’d feared. Clearly, some of the events that I identified, such as China’s post-Mao transition, become significant on a world-historical scale only in the light of subsequent events. I also omitted events that the History Lab ranked highly, often because I subsumed these events within broader chains of interconnected developments. Sadat’s November 1977 announcement of his willingness to visit Jerusalem and his trip to Israel shortly thereafter represented a monumental turning point in the diplomacy of Middle Eastern peace. Sadat prompted the Carter administration to abandon its efforts to build a multilateral peace settlement that would include the Palestinians and precipitated, instead, the march to Camp David where a bilateral Egyptian-Israeli Peace was formalized in September 1978 and then consecrated, in treaty form, the following March. Camp David would have ranked very high on my list; I did not include Sadat’s announcement or trip, because I subsumed both within a broader Camp David process. Similar logic led me to downgrade Saigon’s fall (which I ranked 31/49 on my long-short list), reasoning that this episode marked the culmination of an old conflict whose resolution was already inevitable, not a decisive moment in a malleable history.

More broadly, however, the History Lab’s conclusions align with my broader interpretative project, which is to rethink the 1970s not just as a phase of détente in a long-running Cold War but as a phase when a variety of novel (even post-Cold War) challenges intruded on the foreign-policy landscape and dominated the foreign-policy horizon. This basic point, the History Lab’s data bears out well. Indeed, the Lab’s data -- and the model that it has built -- raise crucial questions for historians about how we should evaluate “significance.” Is that most vital quality to be assessed only from the perspective of hindsight, or can we use quantitative aggregation of contemporary data to achieve a novel perspective? From my vantage, the opportunities for dialogue between these two contrasting methodologies appear to be fruitful.

Final Thoughts:
Rise of the Cyborgs?

We at History Lab were also nervous about how the model would work, or not work, especially when matched against a Harvard Ph.D. who spent more than ten years researching the subject. Obviously, we cheated by throwing out two non-events, and Daniel was generous in not calling us on it (Kasparov would not have given Deep Blue two do-overs). We still had two not-really-events, and the last one shows how all of this work still depends on some degree of human interpretation. Even for the most clear-cut and dramatic events, like the Yom Kippur War, there were cables concerning other things, mundane things, on the peak day of the crisis. But the model counts all of them. This doubtless inflates certain bursts, because the cables about the event in question share TAGS with other, unrelated events. And the conflation of cables also obscures other events that might have been no less important, but did not coincide with things that might have boosted the count.

So some of the events lower down on Daniel’s list are even lower down on our list -- he ranks the coup that toppled Salvador Allende in Chile at 18, and we have it at 41. But it’s still quite a dramatic spike:

Similarly, he puts the Washington Energy Conference -- Kissinger’s attempt to rally the energy-consuming countries against OPEC -- at 14, whereas according to our model it’s 26. Conversely, he has the Angolan civil war and the Portuguese Revolution at 16 and 19, while we put them both at 10, mainly because the cables on these two distinct but related events appear to add up to one big burst.

So what to do? We have made many refinements to this model just to get to this point, and we will likely make many more. This could include using other “features” in the data. So far we have looked into the classification level and handling instructions, and how cables about certain kinds of events might follow particular distribution patterns. We have also tried filtering the data, so as to reduce or eliminate “spam” cables. And we have started to develop a taxonomy of different burst formations. Each of these refinements involves trade-offs, and we could not begin to choose between them without reading -- and interpreting -- a lot of documents. Talking with experts like Daniel is essential to understand what we are missing. But “the machine” has depended all along on humans who know a lot of history, including scholars at both Columbia and the London School of Economics (notably David Allen and Markus Droemann).

We first tested our model “out of sample” -- i.e., to see how it does with different data -- when we added 1977 cables to our first dataset. Area specialists determined that it made for even more interesting results. We will soon add the cables from 1978, which will again reshuffle our results (currently the top bursts cluster in 1977, in part because that’s when the data ends, and for the model that looks like the end of a burst). We also need to test it against completely different and sparser kinds of data, like the Kissinger telephone transcripts and the Clinton emails. If it works, we will turn it into a tool with which users will be able to data-mine all of our document collections.

But even with the biggest supercomputer and an army of statisticians and historians, it’s highly unlikely that we will ever come up with a fully automatic way to mine archives and rank order events. But that’s not the point. The goal here instead is to develop technology that can help people meet the challenge of exponentially larger archives and do what machines can never do: interpret complex data, assess causal relationships, and determine what this history really means for the present and the future.