Deep fused two-step cross-modal hashing with multiple semantic supervision

Peipei Kang, Zehang Lin, Zhenguo Yang, Alexander M. Bronstein, Qing Li, Wenyin Liu

Research output: Contribution to journal › Article › peer-review

Abstract

Existing cross-modal hashing methods ignore informative multimodal joint information and cannot fully exploit semantic labels. In this paper, we propose a deep fused two-step cross-modal hashing (DFTH) framework with multiple semantic supervision. In the first step, DFTH learns unified hash codes for instances with a fusion network. Semantic label reconstruction and similarity reconstruction are introduced to obtain binary codes that are informative, discriminative, and semantic-similarity preserving. In the second step, two modality-specific hash networks are learned under the supervision of common hash code reconstruction, label reconstruction, and intra-modal and inter-modal semantic similarity reconstruction. The modality-specific hash networks can then generate semantics-preserving binary codes for out-of-sample queries. To deal with the vanishing gradients of binarization, the continuous, differentiable tanh function is introduced to approximate the discrete sign function, so that the networks can be trained by back-propagation with automatic gradient computation. Extensive experiments on MIRFlickr25K and NUS-WIDE show the superiority of DFTH over state-of-the-art methods.
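
The tanh relaxation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation used in DFTH, so everything below is an assumption made for illustration: the scaling factor `beta`, the function names, and the convention that `sign(0)` maps to +1 are hypothetical and not taken from the paper. The sketch only shows why sign blocks gradients (its derivative is zero almost everywhere) while tanh does not, and how codes can still be binarized at query time.

```python
import math


def sign(x: float) -> int:
    """Discrete binarization; its derivative is 0 almost everywhere.
    Mapping 0 to +1 is an arbitrary convention chosen for this sketch."""
    return 1 if x >= 0 else -1


def soft_sign(x: float, beta: float = 1.0) -> float:
    """Continuous surrogate tanh(beta * x); beta is a hypothetical
    sharpness parameter (larger beta gives values closer to +/-1)."""
    return math.tanh(beta * x)


def soft_sign_grad(x: float, beta: float = 1.0) -> float:
    """Analytic derivative of tanh(beta * x): beta * (1 - tanh^2(beta * x)).
    Unlike sign, this is non-zero, so gradients can flow during training."""
    t = math.tanh(beta * x)
    return beta * (1.0 - t * t)


def binarize(values: list[float]) -> list[int]:
    """Turn real-valued network outputs into a +/-1 hash code at query time."""
    return [sign(v) for v in values]


def hamming_distance(a: list[int], b: list[int]) -> int:
    """Number of positions where two equal-length codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)
```

In a training loop one would typically apply `soft_sign` to the network outputs so that an autograd framework can differentiate through it, then apply `binarize` only when producing the final codes used for Hamming-distance retrieval.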

Original language: English
Pages (from-to): 15653-15670
Number of pages: 18
Journal: Multimedia Tools and Applications
Volume: 81
Issue number: 11
Digital Object Identifiers (DOIs)
Publication status: Published - May 2022

ASJC Scopus subject areas

  • Software (ASJC 1712)
  • Media Technology (ASJC 2214)
  • Hardware and Architecture (ASJC 1708)
  • Computer Networks and Communications (ASJC 1705)
