Born-digital big data and methods for history and the humanities

[Image: elephant word cloud for big data]

In recent years we have all become familiar with the notion of information overload, the digital deluge, the information explosion, and numerous variations on this idea. At the heart of this phenomenon is the growth of born-digital big data, a term which encompasses everything from aggregated tweets and Facebook posts to government emails, from the live and archived web to data generated by wearable and household technology. While there has been growing interest in big data and the humanities, as exhibited notably in the AHRC's Digital Transformations theme, most academic research in this area has been undertaken by computer scientists and in emerging fields such as social informatics. As yet, there has been no systematic investigation of how humanities researchers are engaging with this new type of primary source, of what tools and methods they might require in order to work more effectively with big data in the future, and of what might constitute a specifically humanities approach to big data research. What kinds of questions will this data allow us to ask and answer? How can we ensure that this material is collected and preserved in such a way that it meets the requirements of humanities researchers? What insights can scholars in the humanities gain from groundbreaking work in the computer and social sciences, and from the archives and libraries that are concerned with securing all of this information?

This new research network will bring together researchers and practitioners from all of these stakeholder groups, to discern whether there is a genuine humanities approach to born-digital data, and to establish how this might inform, complement and draw on other disciplines and practices. Over the course of three workshops (one to be held at The National Archives in Kew, one at the Institute of Historical Research, University of London, and one at the University of Cambridge), the network will address the current state of the field; establish the most appropriate tools and methods for humanities researchers for whom born-digital material is an important primary source; discuss the ways in which researchers and archives can work together to facilitate big data research; identify the barriers to engagement with big data, particularly in relation to skills; and work to build an engaged and lasting community of interest. The focus of the network will be on history, but it will also encompass other humanities and social science disciplines. The network will also include representatives of non-humanities disciplines, including the computer, social and information sciences. Interdisciplinarity and collaborative working are essential to digital research, particularly in such a new and complex area of investigation.

During the 12 months of the project, all members of the network will contribute to a web resource, which will present key themes and ideas to both an academic audience and the interested general public. External experts from government, the media and other relevant sectors will also be invited to contribute, to ensure that the network takes account of a range of opinions and needs. The exchange of knowledge and experience that takes place at the workshops will also be distilled into a white paper, which will be published under a CC-BY licence in month 12 of the network.

  • Principal Investigator: Jane Winters (Institute of Historical Research)
  • Co-Investigator: Tobias Blanke (King's College London)
  • Main Partners: The National Archives of the UK (Sonia Ranade), University of Cambridge (Anne Alexander), University of Sussex (James Baker), University of Waterloo (Ian Milligan), British Library (Jason Webber and Jonathan Pledge), Webster Research and Consulting (Peter Webster)