| Main Authors: | Zhang, Yanyi; Jia, Qi; Fan, Xin; Liu, Yu; He, Ran |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Computer Vision and Pattern Recognition |
| Online Access: | https://arxiv.org/abs/2403.05924 |
| _version_ | 1866929274713276416 |
|---|---|
| author | Zhang, Yanyi; Jia, Qi; Fan, Xin; Liu, Yu; He, Ran |
| author_facet | Zhang, Yanyi; Jia, Qi; Fan, Xin; Liu, Yu; He, Ran |
| contents | Attribute and object (A-O) disentanglement is a fundamental problem in Compositional Zero-shot Learning (CZSL), which aims to recognize novel A-O compositions from previously acquired knowledge. Existing methods based on disentangled representation learning overlook the contextual dependency between the attribute and object primitives in a pair. Motivated by this gap, we propose a novel A-O disentanglement framework for CZSL, the Class-specified Cascaded Network (CSCNet). The key insight is to first classify one primitive and then use the predicted class as a prior to guide recognition of the other primitive in a cascaded fashion. To this end, CSCNet constructs Attribute-to-Object and Object-to-Attribute cascaded branches, in addition to a composition branch that models the two primitives as a whole. Notably, we devise a parametric classifier (ParamCls) to improve the matching between visual and semantic embeddings. By improving A-O disentanglement, our framework achieves better results than previous competitive methods. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2403_05924 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | CSCNET: Class-Specified Cascaded Network for Compositional Zero-Shot Learning |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2403.05924 |
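
The cascaded idea described in the abstract (classify one primitive, then use the predicted class as a prior for the other) can be illustrated with a toy sketch. This is not the authors' implementation: the cosine-similarity classifier, the additive conditioning step, the `prior_weight` parameter, and the four-dimensional embeddings are all assumptions made for illustration, and the composition branch and ParamCls are not modeled.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(feature, class_embeddings):
    """Return the class name whose embedding is most similar to the feature."""
    return max(class_embeddings, key=lambda name: cosine(feature, class_embeddings[name]))


def cascade(feature, first_embeddings, second_embeddings, prior_weight=0.5):
    """Classify the first primitive, then condition the second stage on it.

    Hypothetical conditioning: each second-stage class embedding is shifted
    toward the embedding of the predicted first-stage class.
    """
    first = classify(feature, first_embeddings)
    prior = first_embeddings[first]
    conditioned = {
        name: [e + prior_weight * p for e, p in zip(emb, prior)]
        for name, emb in second_embeddings.items()
    }
    second = classify(feature, conditioned)
    return first, second


# Toy semantic embeddings (assumed values, not taken from the paper).
attributes = {"red": [1, 0, 0, 0], "wet": [0, 1, 0, 0]}
objects = {"apple": [0, 0, 1, 0], "dog": [0, 0, 0, 1]}
image_feature = [0.9, 0.1, 0.8, 0.2]

attr_to_obj = cascade(image_feature, attributes, objects)  # Attribute-to-Object branch
obj_to_attr = cascade(image_feature, objects, attributes)  # Object-to-Attribute branch
print(attr_to_obj, obj_to_attr)
```

Running the same `cascade` function in both directions mirrors the two cascaded branches named in the abstract; how their predictions are combined with the composition branch is not specified in this record and is left out here.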