14:20 〜 14:40
[4K3-IS-2f-02] Generalized Few-Shot Siamese Semantic Segmentation with Pyramid Vision Transformer Backbone
Keywords: Deep Learning, Semantic Segmentation, Generalized Few-shot
Few-shot semantic segmentation enables pre-trained networks to generalize to new data from only a few labelled samples per class, addressing the challenges of data scarcity and annotation cost. While few-shot learning methods have shown success, a more practical challenge lies in segmenting both base classes (pre-trained classes) and novel classes (new classes with few examples) within a single task. To address this, Generalized Few-Shot Semantic Segmentation (GFSS) was introduced, evaluating models on their ability to handle both familiar and unseen classes. Existing approaches rely on VGG and ResNet backbones but struggle to handle multi-scale features, which are crucial for segmenting objects of varying sizes. Additionally, Siamese learning has proven effective for few-shot tasks but has not been widely explored in generalized few-shot learning. This paper proposes a novel solution that integrates the Pyramid Vision Transformer (PVT), which brings multi-scale features to transformers, with a Siamese Transformer Module (STM) for enhanced adaptation of support features to query features. Our approach aims to improve the effectiveness and robustness of GFSS, addressing scale-variation challenges and the need for better adaptation to novel classes.
Our work aims to:
Show the capabilities of PVT for dense predictions
Extend Siamese networks for GFSS
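The abstract does not include an implementation, so the following PyTorch-style sketch is only an illustration of the described idea: multi-scale backbone features (of the kind a PVT backbone exposes at strides 4/8/16/32) are fused, and support features are adapted to query features through cross-attention standing in for the Siamese Transformer Module. All names, channel widths, and class counts here (SiameseCrossAttention, ToyGFSSHead, 15 base + 5 novel classes) are illustrative assumptions, not the paper's actual code.

# Hypothetical sketch (not the authors' released code): fuse multi-scale
# features and condition query tokens on support tokens via cross-attention.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseCrossAttention(nn.Module):
    """Adapts support features to the query image via cross-attention."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_feat, support_feat):
        # query_feat, support_feat: (B, N, C) token sequences from a shared backbone
        adapted, _ = self.attn(query=query_feat, key=support_feat, value=support_feat)
        return self.norm(query_feat + adapted)

class ToyGFSSHead(nn.Module):
    """Minimal GFSS head: multi-scale fusion + joint base/novel classifier."""
    def __init__(self, in_dims=(64, 128, 320, 512), dim=256,
                 num_base=15, num_novel=5):
        super().__init__()
        # Project each pyramid stage to a common width; a PVT-like backbone
        # would expose four such feature maps at decreasing resolutions.
        self.proj = nn.ModuleList(nn.Conv2d(d, dim, 1) for d in in_dims)
        self.stm = SiameseCrossAttention(dim)
        self.cls = nn.Conv2d(dim, num_base + num_novel, 1)

    def fuse(self, feats):
        # Upsample every stage to the finest resolution and sum.
        target = feats[0].shape[-2:]
        return sum(F.interpolate(p(f), size=target, mode="bilinear",
                                 align_corners=False)
                   for p, f in zip(self.proj, feats))

    def forward(self, query_feats, support_feats):
        q = self.fuse(query_feats)               # (B, C, H, W)
        s = self.fuse(support_feats)
        B, C, H, W = q.shape
        q_tok = q.flatten(2).transpose(1, 2)     # (B, HW, C)
        s_tok = s.flatten(2).transpose(1, 2)
        q_tok = self.stm(q_tok, s_tok)           # support-conditioned query tokens
        q = q_tok.transpose(1, 2).reshape(B, C, H, W)
        return self.cls(q)                       # joint base + novel logits

# Shapes only; a real pipeline would plug PVT features in here.
query_feats = [torch.randn(1, d, 32 // s, 32 // s)
               for d, s in zip((64, 128, 320, 512), (1, 2, 4, 8))]
support_feats = [torch.randn_like(f) for f in query_feats]
logits = ToyGFSSHead()(query_feats, support_feats)
print(logits.shape)  # torch.Size([1, 20, 32, 32])

Predicting logits over base and novel classes jointly mirrors the GFSS setting described above, where a single prediction must cover both familiar and unseen categories.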