19 January 2009 Text line extraction in free style document
Proceedings Volume 7247, Document Recognition and Retrieval XVI; 72470L (2009) https://doi.org/10.1117/12.805695
Event: IS&T/SPIE Electronic Imaging, 2009, San Jose, California, United States
Abstract
This paper addresses text line extraction in free-style documents such as business cards, envelopes, and posters. In such documents, global properties such as character size and line direction can hardly be determined, which exposes a serious limitation of traditional layout analysis. In our bottom-up method, the 'line' is the most prominent and the highest-level structure. First, we apply a novel intensity function based on gradient information to locate text areas, where the gradients within a window have large magnitudes and varied directions, and we split these areas into text pieces. We then build a probability model of lines composed of text pieces from statistics on training data. For an input image, we group text pieces into lines using a simulated annealing algorithm whose cost function is based on the probability model.
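The abstract does not give the form of the intensity function, so the following is only a minimal sketch of the stated idea: a window scores high when its gradients are both strong and spread over several directions. The function name `text_intensity`, the parameters `win` and `bins`, and the entropy-based diversity term are all assumptions made for illustration, not the authors' definition.

```python
import math

def text_intensity(img, win=3, bins=4):
    """Hypothetical text-intensity score (not the paper's function).

    img is a list of rows of grayscale values. Each pixel is scored by the
    mean gradient magnitude in a win x win window, multiplied by the
    normalized entropy of the magnitude-weighted orientation histogram.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ori = [[0] * w for _ in range(h)]
    # Central-difference gradients on interior pixels; borders stay zero.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            mag[y][x] = math.hypot(gx, gy)
            # Fold the direction into [0, pi) and quantize it into bins.
            theta = math.atan2(gy, gx) % math.pi
            ori[y][x] = min(int(theta / math.pi * bins), bins - 1)

    r = win // 2
    score = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            hist = [0.0] * bins
            count = 0
            for yy in range(max(0, y - r), min(h, y + r + 1)):
                for xx in range(max(0, x - r), min(w, x + r + 1)):
                    hist[ori[yy][xx]] += mag[yy][xx]
                    count += 1
            total = sum(hist)
            if total == 0:
                continue  # no gradient in this window
            # Entropy is 0 for a single direction and 1 for a uniform spread.
            entropy = -sum((v / total) * math.log(v / total)
                           for v in hist if v > 0) / math.log(bins)
            score[y][x] = (total / count) * entropy
    return score
```

Under these assumptions, a flat region and a single straight edge both score zero, because the first has no gradient and the second has only one direction, while a corner or stroke junction scores above zero. Thresholding the score map would be one possible way to obtain candidate text areas.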
© (2009) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xiaolu Shen, Changsong Liu, Xiaoqing Ding, and Yanming Zou "Text line extraction in free style document", Proc. SPIE 7247, Document Recognition and Retrieval XVI, 72470L (19 January 2009); https://doi.org/10.1117/12.805695
KEYWORDS
Data modeling
Error analysis
RGB color model
Algorithms
Statistical analysis
Statistical modeling
Binary data