TY - GEN
T1 - 3D Mitochondria Instance Segmentation with Spatio-Temporal Transformers
AU - Thawakar, Omkar
AU - Anwer, Rao Muhammad
AU - Laaksonen, Jorma
AU - Reiner, Orly
AU - Shah, Mubarak
AU - Khan, Fahad Shahbaz
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023.
PY - 2023/10
Y1 - 2023/10
N2 - Accurate 3D mitochondria instance segmentation in electron microscopy (EM) is a challenging problem and serves as a prerequisite to empirically analyze their distributions and morphology. Most existing approaches employ 3D convolutions to obtain representative features. However, these convolution-based approaches struggle to effectively capture long-range dependencies in the volume mitochondria data, due to their limited local receptive field. To address this, we propose a hybrid encoder-decoder framework based on a split spatio-temporal attention module that efficiently computes spatial and temporal self-attentions in parallel, which are later fused through a deformable convolution. Further, we introduce a semantic foreground-background adversarial loss during training that aids in delineating the region of mitochondria instances from the background clutter. Our extensive experiments on three benchmarks, Lucchi, MitoEM-R and MitoEM-H, reveal the benefits of the proposed contributions achieving state-of-the-art results on all three datasets. Our code and models are available at https://github.com/OmkarThawakar/STT-UNET.
UR - http://www.scopus.com/inward/record.url?scp=85174673767&partnerID=8YFLogxK
U2 - https://doi.org/10.1007/978-3-031-43993-3_59
DO - https://doi.org/10.1007/978-3-031-43993-3_59
M3 - Conference contribution
SN - 9783031439926
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 613
EP - 623
BT - Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 - 26th International Conference, Proceedings
A2 - Greenspan, Hayit
A2 - Madabhushi, Anant
A2 - Mousavi, Parvin
A2 - Salcudean, Septimiu
A2 - Duncan, James
A2 - Syeda-Mahmood, Tanveer
A2 - Taylor, Russell
PB - Springer Science and Business Media B.V.
T2 - 26th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2023
Y2 - 8 October 2023 through 12 October 2023
ER -