The History Lab's mission is to use data science to recover and repair the fabric of the past. We are beginning with declassified documents, which include some of the earliest examples of electronic records. By bringing together fragmented collections in a common database, we can use natural language processing and machine learning tools to explore them. The ultimate goal is to develop history as a data science so that citizens can keep the government accountable in the age of big data and AI.
Our multidisciplinary team of researchers has gathered nearly 5 million documents, comprising over 18 million pages, to create the Freedom of Information Archive (FOIArchive), the world's largest database of declassified government records.
Recent News
R and Stata interfaces to History Lab API
New releases of the R and Stata packages that interface with the History Lab API.
FOIArchive on Hugging Face
Archives as Data - Summer Institute 2025
Please note that this workshop has been postponed indefinitely.
We are pleased to announce that applications are open for the third edition of our NEH-funded Archives as Data Summer Institute. The Institute will run from June 2 to June 13, 2025, and will offer practical training for historians and archivists in processing and analyzing textual data.