Bibliographic Details
Main Authors: Li, Suyi; Jiang, Chenyi; Wang, Shidong; Long, Yang; Zhang, Zheng; Zhang, Haofeng
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2406.14962
_version_ 1866914843729068032
author Li, Suyi
Jiang, Chenyi
Wang, Shidong
Long, Yang
Zhang, Zheng
Zhang, Haofeng
contents Compositional Zero-shot Learning (CZSL) aims to identify novel compositions by learning from known attribute-object pairs. The primary challenge in CZSL lies in the significant discrepancies introduced by the complex interaction between the visual primitives of attribute and object, which degrade classification performance on novel compositions. Previous works have primarily addressed this issue with disentangling strategies or by using object-based conditional probabilities to constrain the selection space of attributes. However, few studies have explored the problem from the perspective of modeling the mechanism of visual primitive interactions. Inspired by the success of vanilla adversarial learning in Cross-Domain Few-Shot Learning, we take a step further and devise a model-agnostic Primitive-Based Adversarial training (PBadv) method to deal with this problem. In addition, recent studies highlight the weak perception of hard compositions even under data-balanced conditions. To this end, we propose a novel over-sampling strategy with object-similarity guidance to augment the target compositional training data. We perform detailed quantitative analysis and retrieval experiments on well-established datasets, such as UT-Zappos50K, MIT-States, and C-GQA, to validate the effectiveness of the proposed method, which achieves state-of-the-art (SOTA) performance. The code is available at https://github.com/lisuyi/PBadv_czsl.
format Preprint
id arxiv_https___arxiv_org_abs_2406_14962
institution arXiv
publishDate 2024
record_format arxiv
title Contextual Interaction via Primitive-based Adversarial Training For Compositional Zero-shot Learning
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2406.14962