Continuous Learning in Large-scale Problems: The Case of Multi-script Historical Handwritten Document Collections

Activity: Talk or presentation (Professional)

Lambert Schomaker - Speaker

[Invited keynote lecture] Recent advances in deep learning by means of convolutional neural networks are very impressive in many application domains. Are these methods also suitable for the recognition of hitherto unseen handwritten documents in a rare script and language? What to do if the amount of training data is severely limited? What to do if user requirements change continuously over time, requiring not only text recognition but also the characterization of documents in terms of writer identity, general style or 'estimated date of production'? The presentation will focus on the discrepancies between current research habits and the requirements of large-scale machine learning on big data in the highly time-variant context of the Monk project. Many low-level technical adaptations are needed to elevate toy-level machine learning tools to a high-performance computing context. More interestingly, new concepts are needed from AI to allow for an increasingly autonomous mode of learning, as opposed to the current non-scalable paradigm, which is characterized by 'one data set / one PhD student / good results'. Results indicate that modern deep learning and regular pattern recognition need to live side by side in a large-scale real-world system in order to realize usable results, in a process of mutual collaboration between end users and computing resources.

Event (Conference)

Title: International Conference on Agents and Artificial Intelligence
Abbrev. Title: ICAART
Country: Czech Republic
Degree of recognition: International event

ID: 76385315