Abstract
Video anomaly detection (VAD) identifies suspicious events in videos, which is critical for crime prevention and homeland security. In this paper, we propose a simple but highly effective VAD method that relies on attribute-based representations. The base version of our method represents every object by its velocity and pose, and computes anomaly scores by density estimation. Surprisingly, this simple representation is sufficient to achieve state-of-the-art performance on ShanghaiTech, the most commonly used VAD dataset. Combining our attribute-based representations with an off-the-shelf, pre-trained deep representation yields state-of-the-art performance with a 99.1%, 93.7%, and 85.9% AUROC on Ped2, Avenue, and ShanghaiTech, respectively. Our code is available at https://github.com/talreiss/Accurate-Interpretable-VAD.
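The abstract's core idea is scoring per-object attribute features (e.g., velocity and pose) by density estimation. The sketch below illustrates that idea with a simple kNN-distance density estimator; the feature layout, function names, and the use of kNN here are illustrative assumptions, not the paper's exact implementation (see the linked repository for the authors' code).

```python
# Illustrative sketch: score objects by how "dense" their attribute features are
# relative to features seen in normal training videos. Higher distance to the
# nearest training neighbors = lower density = more anomalous.
# NOTE: the kNN estimator and feature layout are assumptions for illustration.
import numpy as np
from sklearn.neighbors import NearestNeighbors


def fit_density_estimator(train_features: np.ndarray, k: int = 2) -> NearestNeighbors:
    """Fit a kNN model on attribute features extracted from normal (training) objects."""
    knn = NearestNeighbors(n_neighbors=k)
    knn.fit(train_features)
    return knn


def anomaly_scores(knn: NearestNeighbors, test_features: np.ndarray) -> np.ndarray:
    """Score each test object by its mean distance to its k nearest training objects."""
    distances, _ = knn.kneighbors(test_features)
    return distances.mean(axis=1)


# Toy usage: each row could be, e.g., [velocity magnitude, pose feature 1, pose feature 2].
train = np.random.rand(1000, 3)                    # attribute features of normal objects
test = np.vstack([np.random.rand(5, 3),            # normal-looking objects
                  np.random.rand(5, 3) + 3.0])     # shifted, out-of-distribution objects
scores = anomaly_scores(fit_density_estimator(train), test)
print(scores)  # the shifted objects should receive noticeably higher scores
```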
| Original language | English |
| --- | --- |
| Journal | Transactions on Machine Learning Research |
| Volume | 2025 |
| State | Published - 2025 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Computer Vision and Pattern Recognition