Beyond OCR: Handwritten manuscript attribute understanding
PhD ceremony: Mr S. (Sheng) He
When: March 17, 2017
Start: 12:45
Supervisors: prof. dr. L.R.B. (Lambert) Schomaker, prof. dr. J.W.J. Burgers
Where: Academy building RUG
Faculty: Science and Engineering

Knowing the author, date and location of handwritten historical documents is very important for historians to fully understand and reveal the valuable information they contain. This thesis studies three attributes (writer, date and geographical location) by analyzing the handwriting style in manuscript images, and develops novel algorithms that estimate these attributes using pattern recognition methods.
Handwriting styles differ between individuals and are implicitly encoded in the handwritten patterns as they are written down. This information can be used for writer identification. In this thesis, different features, such as textural-based, textural-free and grapheme-based features, are designed and extracted to represent the handwriting style of historical handwritten documents in particular. These features are computationally efficient and explainable to end users.
According to paleographical expertise, handwriting styles change gradually and continuously, in general within a relatively limited time frame of 25 years. Modeling this gradual style evolution makes it possible to date and localize historical manuscripts. This thesis presents a system to date charters produced between 1300 and 1550 CE in the Medieval Dutch language area.
We have shown that the designed shape features can be applied quickly and conveniently, without much training effort on new data sets and problems, even when the amount of labeled data is relatively limited.