JSAI2025

Presentation information

General Session

[3N5-GS-7] Vision, speech media processing

Thu. May 29, 2025 3:40 PM - 5:20 PM Room N (Room 1009)

Chair: Kyota Higa (NEC Corporation)

5:00 PM - 5:20 PM

[3N5-GS-7-05] Implementation and Evaluation of a Flame and Smoke Detection System based on YOLO from Still Images

〇NAN DING¹, YIMENG SUN¹, Takao Nakaguchi¹, Miki Ueno¹, Masaharu Imai¹ (1. The Kyoto College of Graduate Studies for Informatics)

Keywords: flame detection, smoke detection, YOLO, image recognition, object detection

In recent years, large-scale fires have frequently occurred worldwide, causing severe damage. Early detection of flames and smoke is essential for minimizing fire-related losses.

This study proposes an enhanced fire and smoke detection model based on YOLOv8 to improve detection accuracy. Specifically, part of the C2f-n module in the YOLOv8 backbone is replaced with a Swin Transformer for better multi-scale feature extraction. Deformable Convolutional Networks (DCN) replace standard convolutional layers to enhance adaptability to shape and position variations. Additionally, a Convolutional Block Attention Module (CBAM) is incorporated into the detection head to improve spatial and channel-wise attention. Data augmentation techniques, such as flipping and rotation, are also applied to enhance model generalization.
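As an illustration of the flipping augmentation mentioned above, the following is a minimal sketch of how a horizontal flip has to be applied to the bounding-box labels as well as to the image. It assumes labels in the normalized YOLO format (class, x_center, y_center, width, height); the function names are hypothetical and are not taken from the paper.

```python
def hflip_yolo_boxes(boxes):
    """Mirror YOLO-format boxes left to right.

    Each box is (class_id, x_center, y_center, width, height) with
    coordinates normalized to [0, 1]. Only x_center changes under a
    horizontal flip; the box size and vertical position are unchanged.
    """
    return [(c, 1.0 - x, y, w, h) for (c, x, y, w, h) in boxes]


def hflip_image(rows):
    """Mirror an image stored as a list of pixel rows (assumed layout)."""
    return [list(reversed(row)) for row in rows]


# Hypothetical example: one "flame" box (class 0) left of center.
labels = [(0, 0.25, 0.5, 0.2, 0.4)]
flipped_labels = hflip_yolo_boxes(labels)
flipped_image = hflip_image([[1, 2, 3], [4, 5, 6]])
```

Flipping the image without flipping the labels would leave the boxes pointing at the wrong region, which is why both are transformed together.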

The evaluation used WSDY, D-Fire, and a custom dataset (5,211 images). Experimental results show that the proposed model achieved an mAP50 of 0.795, compared with 0.744 for the baseline YOLOv8, an absolute gain of 0.051. The model also showed stable performance against complex backgrounds, indicating its potential for real-time applications.
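For readers unfamiliar with the metric, mAP50 counts a predicted box as correct when its intersection over union (IoU) with a ground-truth box of the same class is at least 0.5. The sketch below shows only that matching criterion, not the full average-precision computation, and assumes boxes given as (x1, y1, x2, y2) corner coordinates; the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, gt, threshold=0.5):
    """Apply the IoU >= 0.5 criterion that underlies mAP50."""
    return iou(pred, gt) >= threshold
```

A prediction that overlaps the ground truth by only a third of their combined area, for example, would be treated as a miss under this criterion.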
