The History Lab's mission is to use data science to recover and repair the fabric of the past. We are beginning with declassified documents, which include some of the earliest examples of electronic records. By bringing together fragmented collections in a common database, we can use natural language processing and machine learning tools to explore them. The ultimate goal is to develop history as a data science so that citizens can keep the government accountable in the age of big data and AI.
Our multidisciplinary team of researchers has gathered nearly 5 million documents, comprising over 18 million pages, to create the Freedom of Information Archive (FOIArchive), the world's largest database of declassified government records.
Recent News
Columbia University's History Lab Unveils Revolutionary AI-Powered Archive of 5 Million Declassified Documents
First conversational AI interface for exploring decades of CIA, State Department, and intelligence records launches at Federal Depository Library Conference
R and Stata interfaces to History Lab API
New releases of R and Stata packages to interface with History Lab's API.