A number of research organizations and publishers of large digital archives are making their texts and metadata available for text mining and analysis. The following are some examples. Contact your subject librarian for further information.
EEBO-TCP (Early English Books Online - Text Creation Partnership). 25,000 texts from the first phase of EEBO-TCP have been freely available as open data in the public domain since January 2015.
Gale Digital Collections - Includes the 17th & 18th Century Burney Collections Newspapers; 19th Century British Library Newspapers; 19th Century UK Periodicals; the Economist Historical Archive 1843-2011; Eighteenth Century Collections Online; Sabin Americana, 1500-1926; The Times Digital Archive, 1785-1985; and others. Consult the FAQ for more information about text mining in the Gale collections.
HathiTrust Data Sets - HathiTrust makes the texts of public domain works in its corpus available for research purposes. "HathiTrust announces the release of a significantly expanded open dataset, the HathiTrust Research Center (HTRC) Extracted Features (EF) Dataset <https://analytics.hathitrust.
Proceedings of the Old Bailey: London's Central Criminal Court, 1674-1913. The Old Bailey API allows you to work directly with the text of both the individual trials and the sessions published as part of the Proceedings. You can either use the Old Bailey API Demonstrator to build queries and export texts to Voyant Tools, or address the underlying text directly through the API.
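As a rough illustration of addressing an API directly, the Python sketch below only assembles a query URL and makes no network request. The base URL and the parameter names (`text`, `year_from`, `year_to`) are hypothetical placeholders, not the actual Old Bailey endpoint or fields; take the real values from the Old Bailey API documentation before use.

```python
from urllib.parse import urlencode

# Hypothetical placeholder, NOT the real Old Bailey endpoint.
# Replace it with the address given in the API documentation.
API_BASE = "https://example.org/api/search"


def build_query_url(base_url, **params):
    """Return base_url with params appended as a URL-encoded query string."""
    # Drop parameters set to None so optional filters can be omitted.
    cleaned = {key: value for key, value in params.items() if value is not None}
    if not cleaned:
        return base_url
    return f"{base_url}?{urlencode(cleaned)}"


# Parameter names here are illustrative assumptions only.
url = build_query_url(API_BASE, text="highway robbery", year_from=1750, year_to=None)
print(url)
```

Building the URL separately from fetching it makes it easy to inspect a query before sending it, which helps when running many requests against an archive.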