6:30 PM - 6:50 PM
[2N6-GS-10-04] Validation of the application of a multimodal model in the detection of illegal dumping in rivers
Keywords: river patrol, classification, UAV, multimodal
River management policy in Japan has recently moved toward automated patrols that use drone surveillance and computer vision techniques, owing to a shortage of qualified workers. When detection in river patrols is automated, prediction errors can occur in object identification, so the judgment of illegality does not always reflect the subtle nuances stipulated in the patrol regulations.
The most appropriate approach is to detect illegal objects under riverine class labels, namely dumping and structures, and to classify each detected region into multiple classes. The detected regions can then be assigned to illegal objects automatically according to the classification outputs.
However, building an illegal object detection model is difficult because the full set of riverine object classes cannot be completely defined.
This paper proposes an application for the auto-patrol task that uses text-image multimodal prediction based on a foundation model, CLIP, which incorporates a mapping between riverine text and drone patrol images in order to tag illegal objects. We apply the method to a dataset of drone patrol images and conduct experiments to evaluate the test predictions.
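As a rough illustration of how CLIP-style text-image tagging works in general, the sketch below scores one image embedding against several text-prompt embeddings by cosine similarity and turns the scores into probabilities with a softmax. It is not the authors' implementation: the prompts, the three-dimensional toy embeddings, the `tag_image` helper, and the scale factor of 100 are all assumptions chosen for illustration. In practice the embeddings would come from a CLIP image encoder and text encoder.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def tag_image(image_emb, prompt_embs, scale=100.0):
    """Return a probability per text prompt for one image embedding.

    `scale` sharpens the softmax; the value 100.0 is an assumption
    for this sketch, not a setting taken from the paper.
    """
    prompts = list(prompt_embs)
    logits = [scale * cosine(image_emb, prompt_embs[p]) for p in prompts]
    return dict(zip(prompts, softmax(logits)))


# Hypothetical prompts with toy embeddings standing in for
# real text-encoder outputs.
prompt_embs = {
    "illegally dumped waste on a riverbank": [0.9, 0.1, 0.0],
    "an unauthorized structure beside a river": [0.1, 0.9, 0.0],
    "a natural riverbank with vegetation": [0.0, 0.1, 0.9],
}

# Toy embedding standing in for the image-encoder output of one drone image.
image_emb = [0.8, 0.2, 0.1]

scores = tag_image(image_emb, prompt_embs)
best = max(scores, key=scores.get)
print(best)
```

Because the class set is expressed as free text, a new riverine object type can be covered by adding a prompt rather than retraining a detector, which is the property that makes this kind of model attractive when the classes cannot all be enumerated.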