Publication

Thick 2D relations for document understanding

Aiello, M. & Smeulders, A. M. W., 2004, In: Information Sciences. 167, p. 147-176 (30 p.)

Research output: Contribution to journal › Article › Academic

Documents

  • Thick 2D relations for document understanding

    Final publisher's version, 707 KB, PDF document

We use a propositional language of qualitative rectangle relations to detect the reading order of document images. To this end, we define the notion of a document encoding rule and analyze formalisms in which such rules can be expressed, such as LaTeX and SGML. Document encoding rules expressed in the propositional language of rectangles are then used to build a reading order detector for document images. To make the system robust when applied to real-life document images, we introduce the notion of a thick boundary interpretation for a qualitative relation. The framework is tested on a collection of heterogeneous document images, with recall rates of up to 89%.
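To make the idea of a thick boundary concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a "thick" interpretation can be approximated by treating two interval endpoints as equal whenever they lie within a tolerance `tol`, and that a rectangle relation is a pair of Allen interval relations, one per axis, as the keyword "Bidimensional Allen relations" suggests. The names `Rect`, `allen`, `rect_relation`, and `tol` are hypothetical and do not come from the paper.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    """Bounding box of a layout block (image coordinates, y grows downward)."""
    x1: float
    y1: float
    x2: float
    y2: float


def allen(a1: float, a2: float, b1: float, b2: float, tol: float = 0.0) -> str:
    """Allen relation of interval [a1, a2] with respect to [b1, b2].

    Assumption: endpoints closer than `tol` count as coincident (the "thick"
    boundary). Expects a1 < a2, b1 < b2, and tol smaller than both lengths.
    """
    def cmp(p: float, q: float) -> int:
        if abs(p - q) <= tol:
            return 0
        return -1 if p < q else 1

    if cmp(a2, b1) < 0:
        return "before"
    if cmp(a2, b1) == 0:
        return "meets"
    if cmp(a1, b2) > 0:
        return "after"
    if cmp(a1, b2) == 0:
        return "met_by"

    # From here on the two intervals share interior points.
    s, e = cmp(a1, b1), cmp(a2, b2)
    if s == 0 and e == 0:
        return "equals"
    if s == 0:
        return "starts" if e < 0 else "started_by"
    if e == 0:
        return "finishes" if s > 0 else "finished_by"
    if s < 0 and e < 0:
        return "overlaps"
    if s > 0 and e > 0:
        return "overlapped_by"
    if s > 0 and e < 0:
        return "during"
    return "contains"


def rect_relation(a: Rect, b: Rect, tol: float = 0.0) -> Tuple[str, str]:
    """Pair of Allen relations (x axis, y axis) between two rectangles."""
    return (allen(a.x1, a.x2, b.x1, b.x2, tol),
            allen(a.y1, a.y2, b.y1, b.y2, tol))


# Two blocks that a human would call "same width, stacked", but whose
# extracted boxes are off by a fraction of a unit.
title = Rect(0.0, 0.0, 100.0, 10.0)
body = Rect(0.4, 10.8, 99.7, 60.0)

print(rect_relation(title, body, tol=0.0))  # ('contains', 'before')
print(rect_relation(title, body, tol=1.0))  # ('equals', 'meets')
```

With a zero tolerance, small segmentation noise changes the relation, so a rule that expects a title to be aligned with and adjacent to the body below it would not fire. With a nonzero tolerance the same pair of boxes satisfies it, which is the kind of brittleness the thick boundary interpretation is meant to remove.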
Original language: English
Pages (from-to): 147-176
Number of pages: 30
Journal: Information Sciences
Volume: 167
Publication status: Published - 2004
Externally published: Yes

    Keywords

  • Constraint satisfaction: applications, Bidimensional Allen relations, Spatial reasoning, Document understanding, Document image analysis

ID: 14407231