In 1993, I took part in a fascinating experiment. Professor Kevin Kiernan of the University of Kentucky had for many years been interested in the use of digital imaging to investigate damaged texts of Old English manuscripts. As early as 1987, Kevin had used medical digital imaging equipment to try and record readings under special light sources from a complex folio of the Beowulf manuscript (British Library (BL), Cotton MS Vitellius A.xv, f. 179), described in an article available here. The advances in digital imaging equipment of the early 1990s encouraged Kevin to try a further experiment, this time using a fragment of an Old English Saint's Life from one of the Cotton manuscripts in the BL badly damaged by fire in 1731, Cotton MS Otho B.x. This fragment looks as if it had been through a volcanic eruption; even under ultra-violet light it is not readily legible. Kevin arranged for a trial at the BL of a Roche/Kontron ProgRes 3012 digital camera, which offered a live feed of the digital image, facilitating the use of special lighting sources, and produced (what seemed at the time) huge 21 megabyte images. With the aid of this camera, Kevin, with the assistance of David French of the BL's Manuscripts Conservation Studio, managed to produce a digital image of a fragment from Otho B.x under ultra-violet light clearly showing some of the Old English words in a Life of Saint Sebastian which had never before been reproduced. A reduced version of this image can be seen here.
In 1993, transporting 21 megabyte images was no easy matter. Kevin loaded the images from this experiment onto an external hard drive, but when he returned to Kentucky he found that the drive was empty, the images having presumably been zapped by the security scanners at Gatwick airport. Fortunately, Kevin had already transmitted a back-up copy of the image by phone from London to Kentucky. The phone call to transmit the image had taken some hours and the cost of the call was the astronomical sum of 55 dollars. This was (as far as I am able to establish) the first time a digital image of a medieval manuscript was transmitted across the Atlantic. The 55 dollar phone bill seemed, as Kevin shrewdly observed at the time, 'to portend the start of something really big, expensive, and earth-shattering'.
Kevin and I went on to work together on a much larger-scale digital facsimile involving extensive use of special lighting sources, the Electronic Beowulf. In the course of assembling the digital images for this project, we had many other adventures. When Kevin and David French made a trip to the Royal Library at Copenhagen to scan the transcripts of the Beowulf manuscripts owned by the Danish antiquary Grimr Jónsson Thorkelin, as soon as the special lighting required for the digital camera was switched on, it blew the venerable fuses of the Royal Library. An eloquent testimony of the speed with which access to digital technologies has advanced over the past ten years is that the Royal Library, which in 1995 experienced such difficulties with the Electronic Beowulf scanning equipment, today presents on its websites dozens of e-manuscripts, complete digital facsimiles of medieval and modern manuscripts in its collections.
An even more striking indication of the rapid spread of digital imaging technology since the early experimental days of Electronic Beowulf is apparent at The National Archives (TNA) in London. Historians are fortunate in that the copyright of large parts of the public records in England is owned by the crown. This means that TNA were readily able to give permission to readers to take their own digital images of documents in the archives. As a result, in recent years note taking by scholars using TNA has been transformed beyond recognition; many readers take images of their documents rather than transcribing them. Such is the popularity of digital cameras at TNA that a large number of extra camera stands have recently been provided to assist in taking images. Following the lead of TNA, Oxford University Library Services have also recently relaxed rules on the use of digital cameras, although many other major repositories such as the BL still prohibit the use of digital cameras in reading rooms because of copyright and handling concerns. The regulations of TNA and other repositories which allow the use of digital cameras prevent readers from sharing or distributing the images they make, but if it were possible to gather all these images together, they would form a considerable digital archive.
When we began projects such as Electronic Beowulf, it was widely anticipated that the greater availability of images of archives, manuscripts and early printed materials would transform our engagement with the raw materials of history. We imagined that scholars would become less preoccupied with great overarching themes and would instead refocus on the complexities and issues associated with the primary textual materials which are at the heart of the study not only of history but also of many other humanities disciplines. Many of the high-flown claims made in the early 1990s that the availability of digital imaging and other new technologies would fundamentally transform humanities scholarship reflected the assumption that new methods of making primary materials available would produce a corresponding shift in scholarly engagement with them. Projects such as Electronic Beowulf or the major projects begun at that time in the Institute for Advanced Technology in the Humanities at the University of Virginia, such as Jerome McGann's Rossetti Archive or Edward Ayers's digital archive of two American communities in the Civil War, The Valley of the Shadow, offered profound challenges to orthodox methods of presenting primary textual and other materials and challenged conventional distinctions between, for example, editions and facsimiles. In particular, it was felt that hypertext, by reshaping the way in which the reader moved through text, would reconfigure the structure of scholarly communication.
McGann's 2002 publication Radiant Textuality: Literature after the World Wide Web (1), part of the preface of which is available here, is an entrancing and visionary exploration of some of the possibilities of this new poetics of the web. For McGann, hypertext opened up new ways of exploring previously neglected dimensions of the text. Declaring that ‘Every document, every moment in every document, conceals (or reveals) an indeterminate set of interfaces that open into alternate spaces and temporal relations’ (2), McGann argued that hypertext offered humanities scholars a new means of exploring these textual depths. McGann argued that these interconnections could not be investigated by simply replicating conventional editions in a machine-readable format. Pointing to the examples of the works of Rossetti and Blake, McGann showed how the incorporation of images within a hypermedia archive – something that can still only be done in a limited and clumsy fashion in a conventional printed edition – could bring to light new and neglected aspects of literary textuality. McGann saw the digital image as central to the development of such electronic editions and archives: ‘How to incorporate digitized images into the computational field is not simply a problem that hyperediting must solve; it is a problem created by the very arrival of the possibilities of hyperediting’.(3) As McGann emphasised, these issues were not restricted to the incorporation of images. In the case of authors for whom musical presentation of their work was very important, such as Robert Burns or Yeats, the incorporation of audio-visual material into such archives of their work might also be important.
McGann outlined how, in the years between 1993 and 2000, the availability of the World Wide Web opened up the possibility of completely new forms of representing and exploring historical textuality (considered in its widest sense, as also incorporating sound and image). Some years later, after the expenditure of millions of pounds on digitisation programmes, there is certainly no doubt that digital images of historical records are increasingly commonplace, but doubt must be felt as to the extent to which the vision described by McGann has been realised. Tim Hitchcock has recently noted that historians show a greater readiness to engage with visual material such as films and photographs, and has suggested that this may be due to the easier availability of such material through websites such as the Viewfinder service of the National Monuments Record, the Art and Architecture site at the Courtauld Institute or the British Pathe Film Archive. However, while such sites have made visual material more readily accessible, they do not seem to have had more than a peripheral effect on the practice of history. By and large, historians appear mainly to value digital technologies for their ability to offer improved searchability and more convenient access to conventional textual sources. The digital projects which have hitherto had the greatest impact on the study of history have been those which allow scholars to search large quantities of data more rapidly and in different configurations than was possible with the aid of a card index and a pencil. British History Online is in itself testimony to this. It has made available online many of the canonical reference tools and collections of primary sources which have traditionally lined the walls of the Institute of Historical Research (IHR).
These can now be accessed 24 hours a day, anywhere in the world – a logical extension of the vision of the founders of the IHR which envisaged open shelves containing a wide selection of such reference works and primary sources as the fundamental equipment required in a laboratory of historical research.
The most familiar illustration of the way in which the development of online access to historical sources has been driven by these twin requirements of enhanced access to hard-to-obtain source materials and improved searchability is perhaps The Proceedings of the Old Bailey 1674–1913, directed by Tim Hitchcock, Robert Shoemaker and Clive Emsley. This website provides access to an edition of the printed reports of trials at the Old Bailey from 1674 to 1913, together with the Ordinary of Newgate's Accounts from 1690 to 1772. These records are bibliographically complex in structure and a complete set is difficult to obtain. At one level, then, the Old Bailey Online (OBO) is valuable in making access to this core text for the study of many aspects of 18th-century London extremely quick and simple. The website presents a huge amount of textual information, with reports of more than 210,000 trials and biographical details of approximately 3,000 men and women executed at Tyburn. Without automated assistance, retrieving even the simplest information such as a personal name from such a huge resource would be an overwhelming task. The Proceedings of the Old Bailey not only allows simple keyword searching and swift retrieval of information from this immense quantity of data, but also supports more sophisticated search combinations and even permits statistical information to be represented in graphical form. In order to allow the Old Bailey Proceedings to be presented in this form, however, an enormous amount of work was necessary. Optical character recognition [OCR] remains uncertain in its conversion of 18th-century letterpress, so the texts were manually typed, not just once but twice to improve accuracy. Moreover, in order to support complex search patterns, tags were inserted in the keyed text. Automated programmes were used to insert some tags, but a considerable level of manual editorial intervention was still required.
The millions of hits received by this website justify the expense of this process, but clearly the preparation of such an online edition remains an onerous undertaking.
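The value of keying the text twice lies in comparing the two versions, since two typists are unlikely to make the same slip in the same place. The following is a minimal sketch of such a comparison, not a description of the tooling actually used for the Old Bailey Proceedings, which is not documented here. The function name `keying_disagreements` and the sample sentence are invented for illustration; the comparison relies on `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher


def keying_disagreements(first_keying: str, second_keying: str):
    """Compare two independent transcriptions word by word.

    Returns a list of (word_index_in_first, words_in_first, words_in_second)
    for every stretch where the two keyings differ. An editor would then
    check only these stretches against the original page.
    """
    a = first_keying.split()
    b = second_keying.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    disagreements = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # replace, insert or delete
            disagreements.append((i1, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return disagreements


# Hypothetical sample: the same invented line as keyed by two typists.
typist_one = "John Smith was indicted for stealing a silver watch"
typist_two = "John Smith was indicted for stealing a silver match"

result = keying_disagreements(typist_one, typist_two)
print(result)
```

Where the two keyings agree, the reading is accepted; where they differ, a human decides. This is why double keying raises accuracy without requiring every word to be proofread.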
The Proceedings of the Old Bailey include a scan in pdf format from microfilm of the original printed version of each trial, such as the example here. However, the function of the image in relation to the whole edition seems rather ancillary. Readers are urged to check the image to verify the accuracy of the transcription, but the extent to which they regularly do so is doubtful. In any case, as can be seen from the example here, the quality of these images, low resolution scans from apparently poor quality microfilm presented as small pdf files, is often indifferent and key information such as names is not easily legible online in this form. In short, looking at the Proceedings of the Old Bailey, one might be led to think that the role of digital images in the digital presentation of historical sources is marginal, and that the central concerns in making historical sources available online are fairly old-fashioned issues of transcription and indexing. However, the approach to the use of digital images in the preparation of online historical editions is by no means yet settled, and other projects and services make different uses of digital images.
The Documents Online service of TNA makes use of digital images to provide access to large classes of documents such as wills from the Prerogative Court of Canterbury without the need for extensive editorial intervention of the sort required for major editions such as the Proceedings of the Old Bailey. Within Documents Online, searchable indexes containing outline basic information about the document are made available free of charge. These are linked to scans from microfilm in pdf form which can be downloaded for a small charge. Again, the quality of the scans provided by Documents Online is insufficient to undertake any very detailed work on the document, but they do provide a means of readily making available large quantities of documents to researchers all over the world without the huge editorial overhead involved in such projects as the OBO.
Scans from microfilm are favoured by librarians and archivists for this kind of service because of their relative cheapness and the convenience of the conversion process. For this reason, they are also used in such commercial packages as ProQuest’s Early English Books Online [EEBO] and Gale’s Eighteenth Century Collections Online [ECCO]. In the case of EEBO, which contains scans from microfilm of approximately 100,000 of over 125,000 books listed in the English Short Title Catalogue as published between 1473 and 1700, searching is, as in Documents Online, based primarily on existing catalogue information. Full text transcription of books in this resource is currently being undertaken by the Text Creation Partnership based at the University of Michigan. Full text coverage of EEBO is currently quite limited, and the aim of the Text Creation Partnership at present is only to transcribe about a quarter of the material in the resource. As with Documents Online, the scans from microfilm in EEBO provide a much quicker and cheaper means of providing access to historical sources than a full-dress editorial project.
By contrast, the 150,000 books published in the 18th century which are included in ECCO, containing altogether over 26 million pages, are all fully searchable. This is because the search methods used in this resource do not rely on transcription, but instead use fuzzy searching techniques to interrogate versions of these texts automatically generated from the scans by optical character recognition. However, the difficulties of scanning 18th-century letter forms together with problems posed by the quality of the scans from microfilm mean that the searches in ECCO are rarely completely accurate, so that a full-text search in this resource seldom produces all the references required. Thus, the first edition of Moll Flanders, published in 1722, includes a reference to Moll and her husband going to Northampton (p. 70), but a full-text search of ECCO does not retrieve this reference. To help overcome these problems, the Text Creation Partnership is also seeking to produce full transcriptions of the books in ECCO. Further details of this project are available here, but the current demonstration is restricted to just 23 well-known 18th-century books. These books do, however, include Moll Flanders and the demonstration search engine duly enables the reference to Northampton in Moll Flanders to be quickly located.
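The general idea of fuzzy searching over OCR output can be illustrated with a short sketch. This is not ECCO's search engine, whose workings are not described here; it simply scores each word in a page of OCR text against the query and keeps those above a similarity threshold. The function name `fuzzy_find`, the threshold of 0.8 and the misread sample text (with 'm' recognised as 'rn') are all assumptions made for the example.

```python
import re
from difflib import SequenceMatcher


def fuzzy_find(query: str, ocr_text: str, threshold: float = 0.8):
    """Return (word, score) for each OCR word that resembles the query.

    The score is difflib's similarity ratio between 0 and 1; the threshold
    is an arbitrary illustrative choice, not a recommended setting.
    """
    hits = []
    for word in re.findall(r"[A-Za-z]+", ocr_text):
        score = SequenceMatcher(None, query.lower(), word.lower()).ratio()
        if score >= threshold:
            hits.append((word, round(score, 2)))
    return hits


# Hypothetical OCR output in which "Northampton" has been misread.
ocr_page = "we went to Northarnpton and lay there that night"

# An exact search misses the misread word entirely.
exact_hits = [w for w in ocr_page.split() if w == "Northampton"]

# A fuzzy search tolerates the recognition error.
fuzzy_hits = fuzzy_find("Northampton", ocr_page)
print(exact_hits, fuzzy_hits)
```

The threshold embodies the trade-off described above: set it too high and badly misread words are missed, set it too low and unrelated words are returned, which is one reason searches of this kind are neither complete nor wholly accurate.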
Similar methods to those used in ECCO have been used recently to produce searchable archives of images of newspaper collections scanned from microfilm such as Gale’s Times Digital Archive and collections of British Library Newspapers. In these cases, digital images have been used to produce searchable versions of large archives for which extended transcription would be impracticable. For all the limitations of the OCR-based approach, without the use of digital images the production of these resources would simply not have been feasible except with very large financial resources, beyond the scope of most funding councils and commercial bodies. The biggest disappointment of these resources (and, incidentally, part of the reason why the OCR is not very accurate) is perhaps the poor quality of the images themselves, greyscale scans from microfilm. While these images are generally easier to use than microfilm, they are nevertheless vastly inferior to high resolution colour images.
In 1766, Dr Thomas Bowles, who could not speak Welsh, was appointed as rector of the parish of Trefdraeth in Anglesey, where only five of the 500 parishioners could speak English. The resulting campaign to oust Bowles led to a case against him in the Court of Arches in 1770 and details of this prosecution were published as a pamphlet by the London-based Welsh literary society, the Cymmrodorion, in 1773. This pamphlet is duly included from the British Library’s copy in ECCO and, if your institution subscribes to ECCO, can be accessed here. It is a black and white scan from microfilm with occasional unevenness of tone, but otherwise completely legible. However, because of the importance of the Bowles case in the struggle to defend the Welsh language, this pamphlet is also available as a full colour facsimile from the National Library of Wales’s Digital Mirror. While the ECCO version is perfectly acceptable as a means of obtaining quick access to the text, for a closer study of the characteristics of the pamphlet, the colour images are vastly superior. In the National Library of Wales copy, the dedication of the pamphlet to the powerful local nobleman Sir William Watkins Wynne is placed before the title page, giving it greater prominence, whereas in the British Library copy it is placed afterwards. Examination of the colour images draws attention to the way in which the printer mixes different fonts in the prefatory matter, apparently to help make the message of the pamphlet more prominent. This is by no means so readily apparent in the ECCO scans. In short, scans from microfilm convey information perfectly adequately, but if we want to start to examine a book or manuscript as an artefact, we need colour scans. However, the colour scans of the National Library copy are not searchable, whereas ECCO is.
The value of colour scans is particularly apparent in the case of manuscripts. The wide-ranging selection of digital facsimiles on the National Library of Wales website includes Lloyd George’s diary for 1886. A glance at a page like this, the entry for 10 and 11 January, illustrates how a colour image makes more apparent the way in which Lloyd George compiled his diary by making brief notes at different times during the day. A site like that of the National Library, containing complete colour facsimiles of over 100 manuscripts and records, already provides a substantial digital library with which much useful research could be undertaken. For manuscripts such as those presented on this site, colour images are essential for detailed work and indeed, in some cases, even larger images than those currently provided are required. For example, another manuscript made available by the National Library as a digital facsimile is Peniarth MS 20, a 14th-century manuscript of Brut y Tywysogion (The Chronicle of the Princes). Among the texts at the beginning of this manuscript is a Welsh translation of part of the Promptuarium Bibliae. As can be seen from this folio and this folio, part of this text is badly faded and illegible at the resolution offered in the digital facsimile. A larger image would make this section of Peniarth MS 20 easier to read.
While the images on the National Library of Wales's Digital Mirror are of high quality, the selection of material on this site is primarily aimed at educational rather than research purposes. The aim is clearly to present digital facsimiles of key documents of Welsh history, literature and culture which are mostly already well known and thoroughly studied. This educational emphasis is apparent on many other library and museum sites presenting colour digital images of manuscripts, archives and printed books. This apparently reflects the requirements of the bodies which funded the development of these sites. The Collections pages of the Victoria and Albert Museum contain a searchable database of over 30,000 objects from the Museum's collections. However, these are clearly intended to display the diversity of the objects in the Museum and are selected for their educational value rather than their research potential. This is also evident from the accompanying captions which are in the style of an exhibition label. A similar approach is also evident in the National Maritime Museum's Collections Online pages. Similarly, the British Library's Online Gallery, which contains at the time of writing over 30,000 colour images of items from the British Library's collections, is, as the opening animation makes clear, conceived as an educational resource and apparently intended to compete with parallel offerings from museums. The panel marked Historical Texts offers to take the user to some pivotal moments in British history. Clicking on this panel takes you to a motley series of 'iconic' images reflecting a thoroughly fuddy-duddy conception of British history, based around Bede, Magna Carta, Queen Elizabeth, Nelson and Captain Scott. The site assumes that the user will be satisfied with a single image of an early Bede manuscript, and will not wish to explore further.
Those who do wish to get a deeper understanding of particular texts such as the Sherborne Missal or a Leonardo Notebook are encouraged to use the Library's gimmicky, and thoroughly Microsoft-based, Turning the Pages software. The idea that a researcher might require online access to a high-quality, unmediated digital facsimile of a manuscript which has only received limited public attention appears alien to those responsible for 'learning' in the British Library. The crude way in which images have been shoehorned by the British Library into a broadbrush educational resource is evident from the way in which search functions and metadata have been handled. No advanced search facility is available, so that it is impossible easily to retrieve manuscripts by shelfmark, date or provenance. The extent to which detailed or accurate information is provided about the images is patchy; in some cases full and informative captions are provided, while in others information is limited or misleading. In many cases, such essential information as the manuscript number is concealed under a button with the unenticing label 'More metadata'. Worst of all, the site, which only works properly in Internet Explorer, provides only individual images zoomable using Flash – adequate for the internet tourist, but useless for serious research.
Unfortunately, sites like these, where education and marketing considerations mean that images of historical sources are provided in a dispersed and inchoate fashion, are all too common. Nevertheless, the National Library of Wales's Digital Mirror site shows how the demands of education do not necessarily prevent the provision of high quality complete digital facsimiles. And projects are now coming to fruition which are offering collections of digital facsimiles of manuscripts with real potential for researchers. Among the best known and most useful of these is Codices Electronici Ecclesiae Coloniensis, which has digitised over 200 manuscripts containing over 65,000 pages from the medieval library of the Bishops and Cathedral of Cologne. The collection of images of over 60 early medieval and Celtic manuscripts from Oxford University digitised by David Cooper in the mid-1990s remains an outstanding and surprisingly underused resource. Another major and very useful resource is the Irish Script on Screen project at the Dublin Institute for Advanced Studies. It is only with the creation of larger collections of complete digital facsimiles of this kind, drawn from a wide range of periods and types of document, that the digital image will finally fulfil its promise of transforming historical technique.
Another way in which the use of digital images by historians will become more widespread is through the incorporation of images of objects in library, archive and museum catalogues. Some sense of the potential here can be seen by looking at a number of specialist databases, such as the database of published illustrations of Athenian black and white figure pottery in the Beazley archive of the Classical Art History Centre of the University of Oxford. The inclusion of published images of the pottery in the database brings this apparently dry subject alive in a way that is otherwise difficult to imagine. Likewise, the catalogue of the Library and Museum of Freemasonry in London incorporates images of masonic jewels, revealing that this little-known category of artefact has considerable historical interest, as can be seen from the examples here and here. The Library and Museum of Freemasonry has also recently added images to the catalogue entries for its photographic collections, so that its catalogue forms a photographic gallery of a significant part of British elite society in the late 19th century. Other specialist catalogues and databases which make effective use of images include two British Academy funded projects, the Corpus of Romanesque Sculpture and the Corpus Vitrearum Medii Aevi. These specialist image databases are now being joined by larger collection databases providing thousands of images of objects from major collections such as the British Museum Collection Database, which currently contains over 250,000 images. The National Library of Wales has blazed a very significant trail by incorporating photographs and other images as part of its main catalogue. As approaches like this become more common and as these larger collection databases reach maturity, they will start to have a profound effect on the use of images in historical research.
The early years of digitisation also gave indications of other approaches which are proving increasingly important and will start to become increasingly commonplace in the near future. The use of digital imaging to explore damaged and difficult readings in the Beowulf manuscript paved the way for a number of projects using imaging for the forensic exploration of primary materials. The Digital Image Archive of Medieval Manuscript Music uses techniques similar to those pioneered in the Electronic Beowulf project to recover medieval music from fragments preserved in pastedowns and bindings. Forensic imaging techniques have also been used in the study of the ink writing tablets found at the Roman fort of Vindolanda, while Meg Twycross at Lancaster University has discussed in a forthcoming study the use of special lighting techniques and image enhancement to examine the genesis of documents relating to the York Mystery Cycle.
Patterns of the use of digital images in historical research have often been haphazard and the image of the document has too often been made subsidiary to the edited text. Museums and libraries too often see the image chiefly as an educational and marketing tool. As Jerome McGann pointed out, the inclusion of digital images poses major issues for the creator of digital corpora, but there are signs that historians are becoming increasingly aware of these issues and are beginning to understand the potential of images to transform their engagement with primary material. The transformative effect of the digital image on the historian's interaction with the materials of the past can only continue to gain momentum. Increasingly, digital images will themselves start to form the primary sources with which historians work. To take one example, one of the striking features of the London terrorist attacks in July 2005 was the extent to which images of the events were made by members of the public using mobile phones and low-end digital cameras. As gathered together on sites such as Flickr, these form an important part of the documentation of the day. These, together with other digital material such as CCTV footage, and blogs (another example here), will be indispensable sources for historians studying the events of that day, as is evident from Mike Thelwall's analysis, available here. Increasingly, historians using this kind of material will wish to incorporate images and other digital objects directly into their historical writing, and new forms of historical writing will emerge, offering different narrative and analytical shapes as historians seek to juxtapose and integrate their digital materials in new patterns. 
As digital images become increasingly important as historical sources, so they will also correspondingly emerge not only as one of the key tools available to the historian for investigating sources of all periods but also as one of the forces driving the development of new forms of history.
Dr Andrew Prescott is Manager of Library Services at the University of Wales Lampeter.