
PhD project: MPS - the Medieval Palaeographic Scale for dating and localizing handwritten manuscripts using digital palaeography

Name: Sheng He

Supervisor:
prof. dr. L.R.B. (Lambert) Schomaker

Summary of PhD project:

Scholars from many disciplines who study the Middle Ages are often confronted with a serious problem: their primary sources, namely manuscripts, bear no indication of the time and place in which they were written. This makes it hard, or even impossible, to assess their reliability as a historical source; it is as if an archaeologist had no idea of the age and provenance of the material under examination. The necessary dating and geographical localisation of a manuscript can often only be achieved on the basis of a judgment of its handwriting characteristics by a mere handful of specialists, who often come to conflicting conclusions. Usually, the dating of a script is based on the individual non-verbal intuition of the expert rather than on objective criteria.

This state of affairs is not surprising, because there is a notorious lack of collections of dated manuscripts that could serve as a reference corpus. Just as the archaeologist has the C-14 technique to date organic materials, the medievalist needs a method of dating manuscripts. The current project aims at constructing an objective palaeographical 'scale' of datable elements in late medieval handwriting (1300-1550). This scale will be based on material that has hitherto been neglected for this purpose: charters and other documents in the (city) archives, material that is generally precisely dated and localized. These administrative documents were often written by the same scribes who wrote the undated manuscripts, using the same types of script. The method brings together two domains of expertise. A palaeographer will make a careful selection of documents from several different city archives and analyze their handwriting. In close co-operation, an expert from the field of pattern recognition or machine learning will construct algorithms that estimate the date of a handwritten specimen; these algorithms are trained on a reference data set and tested on known and unknown samples. The use of the computer is an innovative aspect of the project, made possible by the brand new discipline of 'digital palaeography'. There are already various computer programs for automatic writer identification. The most promising of these in this context is surely the Groningen Intelligent Writer Identification System (GIWIS; Brink et al., 2011), recently developed at the University of Groningen, which has already proven quite capable of handling medieval script. If the computer can be shown to classify specimens of that script according to individual scribal features, it is reasonable to expect that more general shape classes, such as those characteristic of the script of a certain period and/or region, can also be detected automatically.
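The idea of training on a dated reference set and testing on known samples can be illustrated with a minimal sketch. It is not the project's or GIWIS's actual algorithm: the nearest-neighbour estimator, the function names `estimate_year` and `leave_one_out_mae`, and the two-dimensional feature vectors are all hypothetical, chosen only to show how dated charters could anchor an estimate for an undated specimen.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def estimate_year(reference, query, k=3):
    """Estimate a year as the mean year of the k nearest dated samples.

    reference: list of (feature_vector, year) pairs from dated documents.
    query: feature vector of the specimen to be dated.
    """
    if not reference:
        raise ValueError("reference set is empty")
    nearest = sorted(reference, key=lambda item: euclidean(item[0], query))[:k]
    return sum(year for _, year in nearest) / len(nearest)


def leave_one_out_mae(reference, k=3):
    """Mean absolute error in years when each dated sample is held out in turn."""
    errors = []
    for i, (features, year) in enumerate(reference):
        rest = reference[:i] + reference[i + 1:]
        errors.append(abs(estimate_year(rest, features, k) - year))
    return sum(errors) / len(errors)


# Invented toy data: (hypothetical shape features, year written on the charter).
reference = [
    ((0.10, 0.80), 1310),
    ((0.15, 0.75), 1330),
    ((0.40, 0.55), 1400),
    ((0.45, 0.50), 1420),
    ((0.80, 0.20), 1520),
    ((0.85, 0.15), 1540),
]

# Date an "undated" specimen from its two closest dated neighbours.
estimate = estimate_year(reference, (0.42, 0.52), k=2)
print(estimate)  # 1410.0

# Test on known samples: hold each dated charter out and re-estimate its year.
print(leave_one_out_mae(reference, k=1))  # 20.0
```

Holding out each dated sample in turn mirrors the "testing on known samples" step: the resulting error in years gives a rough indication of how much trust an estimate for an undated manuscript deserves.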

The GIWIS approach makes use of several 'shape feature' groups, each tapping into a particularity of a script type: textural, allographic, and ink-trace formation oriented, the last of these focusing specifically on the characteristics of manuscripts written with a quill. Whereas the broad knowledge of the human palaeographer is essential, computers can work much faster, thus facilitating a rapid first selection from huge quantities of digitized manuscript images. In the end, the proposed palaeographical scale will make the task of dating medieval manuscripts more objective, much quicker, and less dependent on the individual expert. The scale can be used instantly by countless medievalists in the Netherlands and abroad, thus facilitating the scholarly use of thousands of medieval manuscripts. A subsequent adaptation of the scale can make it fit for Early Modern and modern handwriting, which will make it a valuable tool for new groups of researchers. A secondary result of the proposed system is that the reference collection can also be used for research in the area of optical character recognition of the content of medieval manuscripts. This project will strengthen the Dutch position in the international field of digital palaeography, and it will be another step in the direction of the ultimate goal of teaching the computer to read historical handwritten documents.
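To give a rough sense of what a textural shape feature might look like, the sketch below computes a toy stroke-direction histogram from a binarized image. This is an illustrative assumption, not one of the actual GIWIS features: the function `direction_histogram`, the four directions, and the tiny example image are all hypothetical.

```python
# Neighbour offsets (row, column) for four stroke directions.
DIRECTIONS = {
    "horizontal": (0, 1),
    "diagonal_up": (-1, 1),
    "vertical": (1, 0),
    "diagonal_down": (1, 1),
}


def direction_histogram(image):
    """Return the relative frequency of ink-to-ink steps per direction.

    image: list of rows, each a list of 0 (background) or 1 (ink).
    """
    rows = len(image)
    counts = {name: 0 for name in DIRECTIONS}
    for r in range(rows):
        for c in range(len(image[r])):
            if not image[r][c]:
                continue  # only ink pixels contribute
            for name, (dr, dc) in DIRECTIONS.items():
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < len(image[rr])
                if inside and image[rr][cc]:
                    counts[name] += 1
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


# A purely vertical pen stroke, three pixels tall.
vertical_stroke = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]

hist = direction_histogram(vertical_stroke)
print(hist["vertical"])  # 1.0
```

A histogram of this kind summarizes the overall slant and texture of a hand without segmenting individual letters, which is why such features can be computed quickly over large numbers of page images.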

Last modified: 13 December 2022 1.23 p.m.