Abstract
Modern depth sensors often have low spatial resolution, which hinders their use in real-world applications. In many scenarios, however, the depth map is accompanied by a corresponding high-resolution color image. Learning-based methods have therefore been used extensively for guided super-resolution of depth maps, in which the high-resolution color image guides the inference of a high-resolution depth map from a low-resolution one. Unfortunately, these methods still suffer from texture copying caused by improper guidance from the color image. Specifically, most existing methods obtain guidance by naively concatenating color and depth features. In this paper, we propose a fully transformer-based network for depth map super-resolution. A cascaded transformer module extracts deep features from the low-resolution depth map. It incorporates a novel cross-attention mechanism that seamlessly and continuously injects guidance from the color image into the depth upsampling process. A window partitioning scheme gives complexity that is linear in image resolution, so the network can be applied to high-resolution images. In extensive experiments, the proposed guided depth super-resolution method outperforms other state-of-the-art methods.
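The abstract's two central ideas, cross-attention from depth to color and window partitioning, can be illustrated with a small NumPy sketch. This is not the paper's network. The function names, the single attention head, the random projection matrices `Wq`, `Wk`, `Wv`, and the window size `w` are all assumptions made for illustration. Queries come from depth features and keys/values from color features, and attention is computed only inside each non-overlapping window.

```python
import numpy as np

def window_partition(x, w):
    """Split an (H, W, C) feature map into non-overlapping w x w windows.
    Returns an array of shape (num_windows, w*w, C)."""
    H, W, C = x.shape
    assert H % w == 0 and W % w == 0, "H and W must be multiples of w"
    x = x.reshape(H // w, w, W // w, w, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, w * w, C)

def window_merge(windows, w, H, W):
    """Inverse of window_partition: rebuild the (H, W, C) feature map."""
    C = windows.shape[-1]
    x = windows.reshape(H // w, W // w, w, w, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(H, W, C)

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def windowed_cross_attention(depth_feat, color_feat, Wq, Wk, Wv, w):
    """Single-head cross-attention restricted to w x w windows.
    Queries: depth features. Keys and values: color features (assumed
    to be spatially aligned with the depth features)."""
    H, W, _ = depth_feat.shape
    q = window_partition(depth_feat, w) @ Wq   # (nW, w*w, d)
    k = window_partition(color_feat, w) @ Wk   # (nW, w*w, d)
    v = window_partition(color_feat, w) @ Wv   # (nW, w*w, d)
    d = q.shape[-1]
    # Each window attends only to itself: (nW, w*w, w*w) attention maps.
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d))
    return window_merge(attn @ v, w, H, W)

# Hypothetical toy sizes, chosen only to keep the example small.
rng = np.random.default_rng(0)
H, W, C, d, w = 8, 8, 4, 4, 4
depth_feat = rng.standard_normal((H, W, C))
color_feat = rng.standard_normal((H, W, C))
Wq, Wk, Wv = (rng.standard_normal((C, d)) for _ in range(3))
out = windowed_cross_attention(depth_feat, color_feat, Wq, Wk, Wv, w)
print(out.shape)  # (8, 8, 4)
```

With a fixed window size, each window costs on the order of `(w*w)**2 * d` operations and there are `H*W / (w*w)` windows, so the total grows as `H*W * w*w * d`, which is linear in the number of pixels. Global attention over all `H*W` positions would instead grow quadratically.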
| Original language | English |
|---|---|
| Article number | 2723 |
| Journal | Sensors |
| Volume | 23 |
| Issue number | 5 |
| DOIs | |
| State | Published - Mar 2023 |
Keywords
- attention
- deep learning
- depth maps
- multimodal
- super-resolution
- transformers
ASJC Scopus subject areas
- Analytical Chemistry
- Information Systems
- Biochemistry
- Atomic and Molecular Physics, and Optics
- Instrumentation
- Electrical and Electronic Engineering
Fingerprint
Dive into the research topics of 'Fully Cross-Attention Transformer for Guided Depth Super-Resolution'. Together they form a unique fingerprint.