TY - GEN
T1 - PETA
T2 - 26th International Conference on Pattern Recognition, ICPR 2022
AU - Glaser, Tamar
AU - Ben-Baruch, Emanuel
AU - Sharir, Gilad
AU - Zamir, Nadav
AU - Noy, Asaf
AU - Zelnik-Manor, Lihi
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - In recent years, the number of personal photos captured has increased significantly, giving rise to new challenges in high-level multi-image understanding. Event recognition in personal photo albums is one such challenging scenario, in which life events are recognized from an unordered collection of images that includes both relevant and irrelevant images. Event recognition in images also requires high-level image understanding, as opposed to low-level image object classification. In the absence of methods for analyzing multiple inputs, previous methods adopted temporal mechanisms, including various forms of recurrent neural networks. However, their effective temporal window is local. In addition, they are not a natural choice given the unordered nature of photo albums. We address this gap with a tailor-made solution that combines the power of CNNs for image representation and transformers for album representation to perform global reasoning over the image collection, offering a practical and efficient solution for photo album event recognition. Our solution reaches state-of-the-art results on three prominent benchmarks, achieving above 90% mAP on all datasets. We further explore the related image-importance task in event recognition, demonstrating how the learned attentions correlate with the human-annotated importance for this subjective task, thus opening the door for new applications.
AB - In recent years, the number of personal photos captured has increased significantly, giving rise to new challenges in high-level multi-image understanding. Event recognition in personal photo albums is one such challenging scenario, in which life events are recognized from an unordered collection of images that includes both relevant and irrelevant images. Event recognition in images also requires high-level image understanding, as opposed to low-level image object classification. In the absence of methods for analyzing multiple inputs, previous methods adopted temporal mechanisms, including various forms of recurrent neural networks. However, their effective temporal window is local. In addition, they are not a natural choice given the unordered nature of photo albums. We address this gap with a tailor-made solution that combines the power of CNNs for image representation and transformers for album representation to perform global reasoning over the image collection, offering a practical and efficient solution for photo album event recognition. Our solution reaches state-of-the-art results on three prominent benchmarks, achieving above 90% mAP on all datasets. We further explore the related image-importance task in event recognition, demonstrating how the learned attentions correlate with the human-annotated importance for this subjective task, thus opening the door for new applications.
UR - http://www.scopus.com/inward/record.url?scp=85143635347&partnerID=8YFLogxK
U2 - 10.1109/ICPR56361.2022.9956129
DO - 10.1109/ICPR56361.2022.9956129
M3 - Conference contribution
T3 - Proceedings - International Conference on Pattern Recognition
SP - 2532
EP - 2538
BT - 2022 26th International Conference on Pattern Recognition, ICPR 2022
Y2 - 21 August 2022 through 25 August 2022
ER -