TY - JOUR
T1 - Histology segmentation using active learning on regions of interest in oral cavity squamous cell carcinoma
AU - Folmsbee, Jonathan
AU - Zhang, Lei
AU - Lu, Xulei
AU - Rahman, Jawaria
AU - Gentry, John
AU - Conn, Brendan
AU - Vered, Marilena
AU - Roy, Paromita
AU - Gupta, Ruta
AU - Lin, Diana
AU - Samankan, Shabnam
AU - Dhorajiva, Pooja
AU - Peter, Anu
AU - Wang, Minhua
AU - Israel, Anna
AU - Brandwein-Weber, Margaret
AU - Doyle, Scott
N1 - Publisher Copyright: © 2022 The Authors
PY - 2022/1
Y1 - 2022/1
AB - In digital pathology, deep learning has been shown to have a wide range of applications, from cancer grading to segmenting structures like glomeruli. One of the main hurdles for digital pathology to be truly effective is the size of the dataset needed for generalization to address the spectrum of possible morphologies. Small datasets limit classifiers’ ability to generalize. Yet, when we move to larger datasets of whole slide images (WSIs) of tissue, these datasets may cause network bottlenecks as each WSI at its original magnification can be upwards of 100 000 by 100 000 pixels, and over a gigabyte in file size. Compounding this problem, high-quality pathologist annotations are difficult to obtain, as the volume of necessary annotations to create a classifier that can generalize would be extremely costly in terms of pathologist-hours. In this work, we use Active Learning (AL), a process for iterative interactive training, to create a modified U-net classifier on the region of interest (ROI) scale. We then compare this to Random Learning (RL), where images for addition to the dataset for retraining are randomly selected. Our hypothesis is that AL shows benefits for generating segmentation results versus randomly selecting images to annotate. We show that after 3 iterations, AL, with an average Dice coefficient of 0.461, outperforms RL, with an average Dice coefficient of 0.375, by 0.086.
KW - Active learning
KW - Computational pathology
KW - Digital pathology
KW - Oral cavity cancer
KW - Region of interest
KW - Semantic segmentation
KW - U-net
KW - Whole slide imaging
UR - http://www.scopus.com/inward/record.url?scp=85139351162&partnerID=8YFLogxK
U2 - 10.1016/j.jpi.2022.100146
DO - 10.1016/j.jpi.2022.100146
M3 - Article
C2 - 36268093
SN - 2229-5089
VL - 13
JO - Journal of Pathology Informatics
JF - Journal of Pathology Informatics
M1 - 100146
ER -