Fast key-word searching using 'BoostMap' based embedding

Raid Saabni, Alexander Bronstein

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Dynamic Time Warping (DTW) is a simple but efficient technique for matching sequences with rigid deformation. It is therefore frequently used for matching shapes in general, and shapes of handwritten words in Document Image Analysis tasks in particular. As DTW is computationally expensive, efficient algorithms for fast computation are crucial. Retrieving images from large-scale datasets using DTW is constrained by the need for a linear search over all samples in the dataset. Fast approximation algorithms for image retrieval are mostly based on normed spaces where the triangle inequality holds, which is unfortunately not the case for the DTW distance measure. In this paper we present a novel approach for fast search of handwritten words within large datasets of shapes. The presented approach uses the BoostMap [1] algorithm to embed the feature space with the DTW measure into a Euclidean space, and uses the Locality Sensitive Hashing (LSH) algorithm to rank the k-nearest neighbors of a query image. The algorithm first processes the objects of the large dataset and embeds them into a normed space. A fast approximate k-nearest-neighbor search using LSH in the embedding space then generates the top k ranked samples, which are examined using the exact DTW distance to give accurate final results. We demonstrate our method on a database of 45,800 images of word-parts extracted from the IFN/ENIT database [11] and images collected from 51 different writers. Our method achieves a speedup of four orders of magnitude over the exact method, at the cost of only a 2.2% reduction in accuracy.
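
The filter-and-refine idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: sequences are plain lists of numbers rather than word-part image features, the embedding is a simple reference-object embedding (the DTW distance to a few hand-picked references) instead of the AdaBoost-trained BoostMap embedding, and a brute-force Euclidean scan over the embedded vectors stands in for LSH. The names `dtw`, `embed`, `search`, `refs`, `k` and `p` are hypothetical.

```python
import math


def dtw(a, b):
    """Classic dynamic time warping cost between two numeric sequences.

    Uses absolute difference as the local cost and runs in O(len(a) * len(b)).
    """
    inf = float("inf")
    m = len(b)
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix costs nothing
    for i in range(1, len(a) + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def embed(x, refs):
    """Map a sequence to a vector of DTW distances to reference sequences.

    Assumption: BoostMap would learn which references to use and how to
    weight them; here the references are simply given.
    """
    return [dtw(x, r) for r in refs]


def search(query, database, embedded_db, refs, k=1, p=3):
    """Filter in the embedded space, then refine with exact DTW.

    Assumption: the Euclidean ranking below is a brute-force stand-in
    for an LSH lookup. Returns the indices of the k best matches.
    """
    q = embed(query, refs)
    # Filter step: keep the p candidates closest to the query embedding.
    candidates = sorted(
        range(len(database)), key=lambda i: math.dist(q, embedded_db[i])
    )[:p]
    # Refine step: re-rank only those candidates with the exact DTW cost.
    refined = sorted(candidates, key=lambda i: dtw(query, database[i]))
    return refined[:k]
```

Only `len(refs)` DTW evaluations are needed to embed a query, plus `p` more in the refine step, compared with one per database item for an exhaustive scan. The embedding of the database (`embedded_db`) is computed once, offline.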

Original language: English
Title of host publication: Proceedings - 13th International Conference on Frontiers in Handwriting Recognition, ICFHR 2012
Pages: 734-739
Number of pages: 6
State: Published - 2012
Externally published: Yes
Event: 13th International Conference on Frontiers in Handwriting Recognition, ICFHR 2012 - Bari, Italy
Duration: 18 Sep 2012 to 20 Sep 2012

Publication series

Name: Proceedings - International Workshop on Frontiers in Handwriting Recognition, IWFHR

Conference

Conference: 13th International Conference on Frontiers in Handwriting Recognition, ICFHR 2012
Country/Territory: Italy
City: Bari
Period: 18/09/12 to 20/09/12

Keywords

  • Adaboost
  • BoostMap
  • Dynamic time warping
  • Embedding
  • Nearest neighbor
  • Word searching

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
