16:00 〜 16:20
[2K5-IS-1b-02] Cross-Modal Fish Species Detection in Underwater Environments with Semantic Guidance
Keywords: Classification, Deep learning, Environment Recognition, GAN Model
This study addresses the challenges of small object detection by proposing a semantics-guided cross-modal method that uses natural language processing to assist image recognition, so that the model can accurately locate and identify small targets. In underwater fish species recognition in particular, lighting variations, interference from suspended particles, and high-density fish populations reduce the stability of traditional methods. The study therefore integrates the pre-trained BERT model with the PRB-FPN-Net image recognition technique, applies Retinex and GAN-based image enhancement to improve image quality, and uses semantic annotation to improve fish species identification. Experimental results show that the proposed method achieves an accuracy of 73.2% and a recall of 60.4% and maintains stable detection performance across varying lighting and background conditions. In addition, we demonstrate integration of the workflow with a robotic fish platform for underwater image acquisition and on-board inference. Future research will focus on optimizing open-vocabulary recognition and on integrating acoustic data and underwater 3D sensing technologies to support ecological monitoring and fish behavior analysis.
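The abstract names Retinex as one of the enhancement steps but does not say which variant is used or how it is configured. As a rough illustration only, the sketch below assumes the simplest form, single-scale Retinex on a grayscale image: the illumination is estimated with a Gaussian blur and removed in the log domain. The function names, the `sigma` and `eps` parameters, and the list-of-lists image representation are all hypothetical and are not taken from the presented work.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(img[0])
    out = []
    for row in img:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(img):
    """Swap rows and columns of a list-of-lists image."""
    return [list(col) for col in zip(*img)]


def gaussian_blur(img, sigma=2.0):
    """Separable Gaussian blur: filter the rows, then the columns."""
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(sigma, radius)
    blurred_rows = blur_rows(img, kernel)
    return transpose(blur_rows(transpose(blurred_rows), kernel))


def single_scale_retinex(img, sigma=2.0, eps=1.0):
    """Assumed variant: log(I + eps) - log(blur(I) + eps) for each pixel.

    eps keeps the logarithm defined for zero-valued pixels.
    """
    illumination = gaussian_blur(img, sigma)
    return [
        [math.log(p + eps) - math.log(b + eps) for p, b in zip(row, blur_row)]
        for row, blur_row in zip(img, illumination)
    ]


# Hypothetical 5x5 grayscale patch: dim background with one bright pixel.
patch = [[10.0] * 5 for _ in range(5)]
patch[2][2] = 200.0
enhanced = single_scale_retinex(patch, sigma=1.0)
```

In this form a uniformly lit region maps to values near zero, while pixels brighter than their surroundings come out positive, which is why the technique is often used to reduce uneven illumination. In practice the output would be rescaled to a displayable range before being passed to a detector.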