
Dealing with Humanities Data

A Planned (and Failed) Collaboration Between Upper-Level English, Computer Science, and Statistics Majors

Returning from the SAIL seminar, we (Mouton and Sowell) discovered the timing of The 19th Century Novel and Dealing with Data courses would happily coincide. We hatched a plan for a cross-collaborative project inspired by our experiences in Silicon Valley.

English students would develop big-data-driven research questions about nineteenth-century novels. CSS (Computer Science and Statistics) students would pursue at least one of these using data mining, cleaning, and visualization techniques. CSS students would present their findings to the English students at the end of the term.

Note: Content adapted from original curricular project

Students enroll in one class per term, which meets full time for 3.5 weeks. Before the second week, English students (16 total) would pitch their data-driven research questions to the data science class, alongside pitches from the CSS students (24 total). CSS students would vote on the most interesting project ideas and pursue those in small groups over the remainder of the term. CSS students would present their findings to both classes at the end of the term.


Our learning objectives were distinct but overlapped.


English Students:

  • Introduction to Digital Humanities
  • Develop ability to ask new kinds of questions pertinent to literary history and freshen their engagement with it
  • Develop ability to explain literary research to non-experts
  • Encourage teamwork


Data Science Students:

  • Model working with “real-world” clients
  • Apply data science skills to address research questions in the humanities
  • Develop ability to explain technical material to non-experts
  • Encourage teamwork


Assignment Schedule


We introduced the English students to Digital Humanities by watching a brief, animated video (Big Data + Old History) and discussing the concept of “distant reading” (Moretti). English students then brainstormed a few project ideas to pitch by the end of the first week.

Resources & Materials

Big Data + Old History

Adam Crymble (history PhD candidate) presents his 2-minute DH dissertation in this contest-winning video.

“Part 1: Foundation”

Macroanalysis: Digital Methods and Literary History

Matthew J. Jockers, U Illinois P: 2013. 1-34

A good introduction to DH methodologies.



Graphs, Maps, Trees: Abstract Models for Literary History

Franco Moretti, Verso: 2007

DH project examples related to British literary history.


“Developing Domains for Multimodal Writing”

Multimodal Assessment Project (MAP) Group

Digital Writing: Assessment and Evaluation

Heidi A. McKee and Danielle Nicole DeVoss

MAP Group provides considerations relevant to assessing multimodal student projects.


“Reading by the Numbers: When Big Data Meets Literature”

Jennifer Schuessler, New York Times, October 30, 2017

Provides overview with positive and negative assessments of DH as a field.

Outcomes and Significance


Unfortunately, CSS students did not vote for any of the projects pitched by the English students, so the collaboration did not materialize. The plan was a failure, but an instructive one that spurred the English students to tackle other DH projects later in the term. These, along with the data science course, are described in the following Project Outcomes:


(Link) Digital Humanities and Mary Shelley’s The Last Man: An Introductory Teaching Unit on Digital Humanities Analysis and the Nineteenth-Century Novel

(Link) Dealing with Data: A Collaborative, Project-Based Data Science Course for Computer Science and Statistics Students


Multiple factors doomed the collaboration from the outset:

  • English students had completed only part of a single novel by Wednesday, and it was difficult to formulate meaningful questions with so little experience in the content area.
  • In the first days of class, English students had insufficient time for the DH readings, brainstorming, and honing their pitches.
  • Neither the English students nor the English professor had sufficient programming knowledge to identify existing, publicly available datasets or to develop ambitious but feasible questions.
  • The CSS students had no background in British literary history, and the English students did not know their audience.

We intend to try this again and anticipate a successful outcome with a few changes to the assignment. CSS and English students could design projects together in small groups from the outset. This would increase the CSS students’ investment in the project ideas and increase the English students’ understanding of what’s possible.

Given the time constraint, the instructors could identify and introduce available datasets to students in advance of their brainstorming, and provide detailed examples of project ideas for guidance. Moreover, the CS professor could participate in the English students’ brainstorming session to provide a conceptual overview of working with data, explain basic terminology (e.g., “data scraping” and “data cleaning”), and help them recognize technical possibilities and constraints.

CSS and English students should meet at least twice during the term to discuss progress. Finally, in addition to the CSS students presenting their final work to both classes, English students could present their “analog” research projects to both groups. The collaboration (and term) could thus conclude with a discussion of the benefits and limitations of working quantitatively with discrete data on Humanities (and other kinds of) research questions.

Meanwhile, the failed pitches motivated the English students to take on their own DH projects. These, and a description of the Dealing with Data course, follow.
