Automatic Synthesis of Historical Arabic Text for Word-Spotting

Majeed Kassis, Jihad El-Sana

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


We present a novel framework for automatic and efficient synthesis of historical handwritten Arabic text. The main purpose of this framework is to assist word spotting and keyword searching in handwritten historical documents. The proposed framework consists of two main procedures: building a letter connectivity map and synthesizing words. A letter connectivity map includes multiple instances of the various shapes of each letter, since a letter in Arabic usually has multiple shapes depending on its position in the word. Each map represents one writer and encodes that writer's specific handwriting style. The letter connectivity map is used to guide the synthesis of any continuous Arabic subword, word, or sentence. The proposed framework automatically generates the letter connectivity map from several previously annotated historical pages. Once the letter connectivity map is available, our framework can synthesize the pictorial representation of any Arabic word or sentence from its text representation. The writing style of the synthesized text resembles the writing style of the input pages. The synthesized words can be used in word spotting and many other historical document processing applications. The proposed approach provides an intuitive and easy-to-use framework for searching for a keyword in the rest of the manuscript. Our experimental study shows that our approach enables word-spotting algorithms to produce accurate results.
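The dependence of letter shape on position can be illustrated with a small sketch. It is not the authors' implementation: the function names, the `letter_map` layout (a dictionary from a `(letter, form)` pair to a list of stored instances), and the instance labels are all assumptions made for illustration. It relies only on the standard property of Arabic script that a few letters never connect to the letter that follows them, and it ignores ligatures, diacritics, and the standalone hamza.

```python
import random

# Letters that never connect to the FOLLOWING letter (standard Arabic script rule):
# alef variants, dal, dhal, ra, zay, waw variants, teh marbuta, alef maksura.
NON_JOINING_AFTER = {
    "\u0627", "\u0623", "\u0625", "\u0622",  # alef and its hamza/madda variants
    "\u062f", "\u0630", "\u0631", "\u0632",  # dal, dhal, ra, zay
    "\u0648", "\u0624",                      # waw, waw with hamza
    "\u0629", "\u0649",                      # teh marbuta, alef maksura
}


def positional_forms(word):
    """Return (letter, form) pairs, where form is one of
    'isolated', 'initial', 'medial', 'final'."""
    forms = []
    prev_joins = False  # does the previous letter connect forward to this one?
    for i, ch in enumerate(word):
        has_next = i + 1 < len(word)
        joins_next = has_next and ch not in NON_JOINING_AFTER
        if prev_joins and joins_next:
            form = "medial"
        elif prev_joins:
            form = "final"
        elif joins_next:
            form = "initial"
        else:
            form = "isolated"
        forms.append((ch, form))
        prev_joins = joins_next
    return forms


def pick_instances(word, letter_map, rng=random):
    """Pick one stored instance per letter shape from a hypothetical
    per-writer map: {(letter, form): [instance, ...]}."""
    return [rng.choice(letter_map[(ch, form)]) for ch, form in positional_forms(word)]


# Usage: the word "kitab" (kaf, teh, alef, beh), written with escapes
# so the example does not depend on bidirectional text rendering.
kitab = "\u0643\u062a\u0627\u0628"
print(positional_forms(kitab))
```

For this word the sketch assigns the forms initial, medial, final, and isolated in that order, because alef does not connect to the letter after it. A real connectivity map would store image patches and connection points rather than labels; the lookup by letter and position is the only part this sketch tries to convey.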

Original language: American English
Title of host publication: Proceedings - 12th IAPR International Workshop on Document Analysis Systems, DAS 2016
Number of pages: 6
ISBN (Electronic): 9781509017928
State: Published - 10 Jun 2016
Event: 12th IAPR International Workshop on Document Analysis Systems, DAS 2016 - Santorini, Greece
Duration: 11 Apr 2016 to 14 Apr 2016

Publication series

Name: Proceedings - 12th IAPR International Workshop on Document Analysis Systems, DAS 2016


Conference: 12th IAPR International Workshop on Document Analysis Systems, DAS 2016

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Library and Information Sciences

