6:30 PM - 6:50 PM
[2C6-GS-7-04] Person-ReID: What is the Deep Learning Model looking at?
Keywords: Person Re-Identification, Grad-CAM, Vision Transformer, CNN
Person re-identification (Re-ID) is a crucial component of automatic visual surveillance systems, aiming to automatically identify and locate individuals across a multi-camera network. Because the appearance of pedestrians varies significantly between cameras, many models have been proposed that achieve high accuracy on existing benchmark datasets; however, they remain far from applicable to real-world environments. Addressing this issue requires insight into the behavior of the deep-learning model as a black box. In this study, we trained CNN and Vision Transformer models on the DukeMTMC-ReID dataset and performed cross-domain evaluation of the trained models on Market1501 and CUHK03. The results revealed that the Vision Transformer outperformed the CNN in terms of accuracy. To examine the stability of the Vision Transformer model, we employed Grad-CAM for visualization. The visualization confirmed the superior stability of the Vision Transformer, as it focused on distinctive features of the person and their interrelationships while avoiding distractions from the background.
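For readers unfamiliar with the visualization step, the following is a minimal Grad-CAM sketch, not the authors' implementation: it assumes a torchvision ResNet-50 as a stand-in CNN backbone, hooks its last convolutional block, and weights the activations by their averaged gradients; the layer choice, input size, and the image file name are illustrative assumptions only.

    # Minimal Grad-CAM sketch (assumption: torchvision ResNet-50 as the CNN backbone;
    # a trained Re-ID model would be substituted in practice).
    import torch
    import torch.nn.functional as F
    from torchvision import models, transforms
    from PIL import Image

    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    model.eval()

    activations, gradients = {}, {}

    def forward_hook(module, inp, out):
        activations["value"] = out.detach()

    def backward_hook(module, grad_in, grad_out):
        gradients["value"] = grad_out[0].detach()

    # Hook the last convolutional block (layer4), a common Grad-CAM target layer.
    model.layer4.register_forward_hook(forward_hook)
    model.layer4.register_full_backward_hook(backward_hook)

    preprocess = transforms.Compose([
        transforms.Resize((256, 128)),   # typical pedestrian-image aspect ratio in Re-ID
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    img = preprocess(Image.open("pedestrian.jpg").convert("RGB")).unsqueeze(0)  # hypothetical image

    logits = model(img)
    score = logits[0, logits.argmax()]   # score of the top predicted class/identity
    model.zero_grad()
    score.backward()

    # Grad-CAM: weight each activation channel by its spatially averaged gradient,
    # sum over channels, and keep only positive contributions.
    weights = gradients["value"].mean(dim=(2, 3), keepdim=True)    # (1, C, 1, 1)
    cam = F.relu((weights * activations["value"]).sum(dim=1))      # (1, H, W)
    cam = cam / (cam.max() + 1e-8)                                  # normalize to [0, 1]
    cam = F.interpolate(cam.unsqueeze(1), size=img.shape[-2:],
                        mode="bilinear", align_corners=False)[0, 0]
    # `cam` can now be overlaid on the input image as a heat map showing
    # which regions the model attended to, as discussed in the abstract.

For a Vision Transformer, the same gradient-weighting idea is typically applied to the token-level features reshaped into a spatial grid, rather than to convolutional feature maps.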