
PhD project: Thai character detection and recognition in natural scene images for portable devices

Name: Bowornrat Sriman

Supervisors:
prof. dr. L.R.B. (Lambert) Schomaker

Co-supervisor:
Asst. prof. dr. Chatklaw Jareanpon (Faculty of Informatics, Mahasarakham University, Thailand)

Summary of PhD project:

Optical Character Recognition (OCR) is an important application of computer vision and is widely applied for a variety of purposes, such as the recognition of street signs or buildings in natural scene images. The symbolic tagging of objects in Google Street View, for instance, benefits from scene text recognition: objects can be tagged automatically based on the text that occurs in scene images. Information about tagged objects can then be used for scene analysis and data mining. As another example, language-translation apps on smartphones use character recognition in scene images to help tourists understand a local script when travelling abroad. These examples show that OCR has become increasingly important.

To recognize text from photographs, the characters first need to be identified, but scene images, whether from a mobile device or a regular digital camera, contain many obstacles that affect character identification performance. Visual recognition problems such as luminance noise, varying 2D and 3D font styles, cluttered backgrounds, occlusion, and distortion cause difficulties in the OCR process. Scanned documents, in contrast, usually contain flat, machine-printed characters in ordinary font styles, with consistent colour, stable lighting, and a clear contrast against a plain background. For these reasons, OCR of photographic scene images is still a challenge.

Figure: Bowornrat et al. (2014), sample pictures of visual recognition problems in scene images for text recognition.

My project covers the full OCR pipeline on a portable device. New datasets of natural scene images have been collected, with Thai script as the main dataset of the study. Following an expectancy-driven approach, character models and codebooks of Thai text were created using several feature-extraction techniques, e.g., SIFT, SURF, or HOG. The codebooks will be used to detect text areas in a scene image, applying text-detection algorithms such as Object Attention Patches or Connected Components. Machine-learning algorithms, for example SVM and ANN classifiers, will be studied to improve detection accuracy. The characters will then be segmented from the detected regions using well-known techniques such as MSER or Connected Components, so that individual characters can be extracted from the image. Character classification, recognition, and correction will be implemented next. Finally, the complete pipeline will be implemented on a portable device and evaluated in the last part of the project; a minimal sketch of the detection and feature-extraction steps is given below.
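To illustrate the kind of pipeline described above, the following sketch uses OpenCV in Python to find candidate text regions with MSER, describe each region with HOG features, and set up an SVM that could classify the descriptors. This is not the project's actual implementation; the input file name, window size, and training data are assumptions for illustration only.

```python
import cv2
import numpy as np

# Illustrative sketch, not the project's pipeline: MSER candidate regions,
# HOG descriptors, and an (untrained) SVM classifier.
image = cv2.imread("scene.jpg")                      # hypothetical input image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# 1. Candidate region detection with MSER (Maximally Stable Extremal Regions).
mser = cv2.MSER_create()
regions, _ = mser.detectRegions(gray)
boxes = [cv2.boundingRect(r.reshape(-1, 1, 2)) for r in regions]

# 2. Feature extraction: HOG descriptor on each candidate, resized to a fixed
#    window so all descriptors have the same length.
hog = cv2.HOGDescriptor(_winSize=(32, 32), _blockSize=(16, 16),
                        _blockStride=(8, 8), _cellSize=(8, 8), _nbins=9)
descriptors = []
for (x, y, w, h) in boxes:
    patch = cv2.resize(gray[y:y + h, x:x + w], (32, 32))
    descriptors.append(hog.compute(patch).flatten())

# 3. Classification: an SVM trained on labelled Thai character patches would
#    score each descriptor as text / non-text (training data not shown here).
svm = cv2.ml.SVM_create()
svm.setKernel(cv2.ml.SVM_RBF)
# svm.train(np.array(train_descriptors, np.float32),
#           cv2.ml.ROW_SAMPLE, np.array(train_labels, np.int32))
```

In a real system the HOG descriptors could equally be replaced by SIFT or SURF features quantised against the learned codebooks, and the SVM by an ANN, as mentioned in the project description.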
