TY - GEN
T1 - Simplifying the reading of historical manuscripts
AU - Asi, Abedelkadir
AU - Cohen, Rafi
AU - Kedem, Klara
AU - El-Sana, Jihad
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/11/20
Y1 - 2015/11/20
N2 - Complex document layouts pose prominent challenges for document image understanding algorithms. These layouts impose irregularities on the location of text paragraphs, which consequently induce difficulties in reading the text. In this paper, we present a robust framework for analyzing historical manuscripts with complex layouts. This framework aims to provide a convenient reading experience for historians through top-notch algorithms for text localization, classification, and dewarping. We segment text into spatially coherent regions and text-lines using texture-based filters and refine this segmentation by exploiting Markov Random Fields (MRFs). A principled technique is presented for dewarping curvy text regions using a non-linear geometric transformation. The framework has been validated using a subset of a publicly available dataset of historical documents, and it provided promising results.
UR - http://www.scopus.com/inward/record.url?scp=84962502266&partnerID=8YFLogxK
U2 - 10.1109/ICDAR.2015.7333877
DO - 10.1109/ICDAR.2015.7333877
M3 - Conference contribution
T3 - Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
SP - 826
EP - 830
BT - 13th IAPR International Conference on Document Analysis and Recognition, ICDAR 2015 - Conference Proceedings
T2 - 13th International Conference on Document Analysis and Recognition, ICDAR 2015
Y2 - 23 August 2015 through 26 August 2015
ER -