Text Mining and Analysis in the Humanities

This guide provides background information on data and text mining.

A few examples of humanities text mining projects

The Digital Scholarship Lab at the University of Richmond "collects a number of interrelated projects on the sectional crisis, slavery, and emancipation during the Civil War era, with a particular emphasis on the histories of the city of Richmond and the state of Virginia" in Hidden Patterns of the Civil War. Some of its projects include Mining the Dispatch, Mapping Richmond's Slave Market, Voting America: Civil War Elections, and Visualizing Emancipation.

Mapping the Republic of Letters. Stanford University's Mapping the Republic of Letters is an ongoing effort to visualize certain ... correspondence networks as a way of exploring a bundle of historical questions about the geographic range, diversity, and interactions among intellectuals during the seventeenth, eighteenth, and nineteenth centuries.

The Modernist Journals Project:  "The main purpose of this site is to do cool things with the data that the Modernist Journals Project has generated over the course of digitizing magazines from the early 20th century. The site is experimental, but it's also dedicated to experimentation―playing with the MJP data, and drawing new patterns and knowledge out of its journal files."

Robots Reading Vogue.  "Few magazines can boast being continuously published for over a century, familiar and interesting to almost everyone, full of iconic pictures — and also completely digitized and marked up as both text and images. What can you do with over 2,700 covers, 400,000 pages, 6 TB of data? Students, librarians and faculty are excited about the possibilities of working with Vogue to explore questions in fields from gender studies to computer science. "

"The Stanford Literary Lab is a research collective that applies computational criticism, in all its forms, to the study of literature. The Lab is open to students and faculty at Stanford, and, on a more ad hoc basis, to those from other institutions." Its current projects are profiled on its website and subsequently published in its pamphlet series.