Center for Digital Humanities

Digital Humanities Projects

Dirty History Crawler

Dirty History Crawler is a tool that discovers, analyzes, and graphs networks of authorship and work in WorldCat, the union catalog of some two billion assets catalogued among library OPACs worldwide.


The tool is still in beta. The user enters a list of authors on the web page, and the program searches WorldCat for each one as either a keyword or an author search (keyword searches work better for some authors). For each item returned, the program creates a Zotero library entry. It then creates node and edge files for use with network analysis programs like Gephi, along with a bi-modal sociomatrix for use in statistical analysis of the author's social print network. After completing these tasks for each author, the program deduplicates the results of all author queries and creates an amalgamated social network of all queried authors. The user then receives the results by email as a zip file containing a folder for each individual author with that author's social network files, plus the amalgamated social network files. The sociomatrix is a particularly useful data format for social network analysis in programs like R.
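
The post-search steps described above (deduplication, edge file, bi-modal sociomatrix) can be illustrated with a minimal Python sketch. This is not the crawler's actual code: the record layout (`work_id`, `authors`), the function names, and the sample data are all hypothetical, and deduplicating on a single work identifier is an assumption about how duplicates might be detected. The `Source` and `Target` column headers follow the convention Gephi uses when importing an edge table from CSV.

```python
import csv
import io


def deduplicate(records):
    """Keep the first record seen for each work_id (assumed dedup key)."""
    seen = set()
    unique = []
    for rec in records:
        if rec["work_id"] not in seen:
            seen.add(rec["work_id"])
            unique.append(rec)
    return unique


def build_edges(records):
    """Return sorted, unique (author, work_id) pairs: one edge per authorship."""
    edges = set()
    for rec in records:
        for author in rec["authors"]:
            edges.add((author, rec["work_id"]))
    return sorted(edges)


def build_sociomatrix(edges):
    """Build a two-mode (author x work) 0/1 matrix from the edge list."""
    authors = sorted({a for a, _ in edges})
    works = sorted({w for _, w in edges})
    present = set(edges)
    matrix = [[1 if (a, w) in present else 0 for w in works] for a in authors]
    return authors, works, matrix


def edges_to_csv(edges):
    """Serialize edges as CSV text with Source/Target headers."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buf.getvalue()


# Hypothetical results merged from two author queries; "w1" appears twice.
records = [
    {"work_id": "w1", "authors": ["Author A", "Author B"]},
    {"work_id": "w2", "authors": ["Author A"]},
    {"work_id": "w1", "authors": ["Author A", "Author B"]},
]

unique = deduplicate(records)
edges = build_edges(unique)
authors, works, matrix = build_sociomatrix(edges)
edge_csv = edges_to_csv(edges)
```

In the matrix, rows are authors and columns are works, so a 1 marks that the author is credited on that work. A table of this shape can be written to CSV and read into R as a matrix for two-mode network analysis.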