Staff View: TwistNet-2D: Learning Second-Order Channel Interactions via Spiral Twisting for Texture Recognition

Bibliographic Details
Main Authors:	Lian, Junbo Jacob; Xiong, Feng; Sun, Yujun; Ouyang, Kaichen; Ke, Zong; Yu, Mingyang; Fu, Shengwei; Rui, Zhong; Yujun, Zhang; Chen, Huiling
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.07262

_version_	1866915974887768064
author	Lian, Junbo Jacob; Xiong, Feng; Sun, Yujun; Ouyang, Kaichen; Ke, Zong; Yu, Mingyang; Fu, Shengwei; Rui, Zhong; Yujun, Zhang; Chen, Huiling
author_facet	Lian, Junbo Jacob; Xiong, Feng; Sun, Yujun; Ouyang, Kaichen; Ke, Zong; Yu, Mingyang; Fu, Shengwei; Rui, Zhong; Yujun, Zhang; Chen, Huiling
contents	Second-order feature statistics are central to texture recognition, yet existing mechanisms exhibit a structural tension: bilinear pooling and Gram matrices capture global channel correlations but discard spatial structure, whereas self-attention models capture cross-position relations through weighted sums rather than explicit pairwise products. We propose TwistNet-2D, a lightweight module that computes local pairwise channel products under directional spatial displacement, jointly encoding where features co-occur and how they interact. The core component, Spiral-Twisted Channel Interaction (STCI), shifts one feature map along a prescribed direction before L2-normalized channel multiplication, capturing cross-position co-occurrence patterns that characterize structured and periodic textures. Four directional heads are aggregated through content-adaptive channel reweighting, and the result is injected via a sigmoid-gated residual path with near-zero initialization. TwistNet-2D adds only approximately 3.5% parameters and approximately 2% FLOPs over ResNet-18. To isolate the contribution of architectural inductive bias from that of transfer learning, all models in this study are trained from scratch without ImageNet pretraining. Under this protocol, TwistNet-2D consistently surpasses parameter-matched baselines and substantially larger ConvNeXt and Swin Transformer backbones across four texture and fine-grained recognition benchmarks, while the multi-head structure produces interpretable, orientation-selective representations that align with classical texture analysis.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_07262
institution	arXiv
publishDate	2026
record_format	arxiv
title	TwistNet-2D: Learning Second-Order Channel Interactions via Spiral Twisting for Texture Recognition
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2602.07262
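
Illustrative note on the contents field: the abstract describes shifting one feature map along a direction, multiplying L2-normalized channels pairwise, and injecting the result through a sigmoid-gated residual that starts near zero. The sketch below is one possible reading of those steps in plain Python and is not the authors' implementation. Everything named here is an assumption introduced for illustration: the four offsets in `DIRECTIONS`, normalization over the channel axis at each position, zero padding at the borders, the gate logit of -5.0, and all function names. The content-adaptive reweighting of the four heads is omitted.

```python
import math

# Hypothetical (dy, dx) offsets, one per directional head. The record only
# states that there are four heads; these particular offsets are assumed.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def shift(channel, dy, dx):
    """Zero-padded spatial shift: out[y][x] = channel[y - dy][x - dx]."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = channel[sy][sx]
    return out


def l2_normalize(fmap, eps=1e-6):
    """Scale the channel vector at each position to (roughly) unit L2 norm.

    Normalizing over channels is an assumption; the abstract does not say
    which axis is normalized.
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = [[[0.0] * w for _ in range(h)] for _ in range(c)]
    for y in range(h):
        for x in range(w):
            norm = math.sqrt(sum(fmap[k][y][x] ** 2 for k in range(c))) + eps
            for k in range(c):
                out[k][y][x] = fmap[k][y][x] / norm
    return out


def twisted_products(fmap, dy, dx):
    """products[i][j][y][x] = a[i][y][x] * a[j][y - dy][x - dx],
    where a is the normalized feature map (C x H x W nested lists)."""
    a = l2_normalize(fmap)
    b = [shift(ch, dy, dx) for ch in a]  # displaced copy of every channel
    c, h, w = len(a), len(a[0]), len(a[0][0])
    return [[[[a[i][y][x] * b[j][y][x] for x in range(w)] for y in range(h)]
             for j in range(c)] for i in range(c)]


def gated_residual(x, update, gate_logit=-5.0):
    """x + sigmoid(gate_logit) * update on two same-shaped 2D maps.

    A strongly negative logit keeps the gate close to zero at the start,
    which is one way to realize a near-zero initialization.
    """
    gate = 1.0 / (1.0 + math.exp(-gate_logit))
    return [[xv + gate * uv for xv, uv in zip(xr, ur)]
            for xr, ur in zip(x, update)]


# Toy input: 2 channels on a 2 x 2 grid.
fmap = [
    [[3.0, 0.0], [0.0, 1.0]],
    [[4.0, 0.0], [0.0, 0.0]],
]
heads = [twisted_products(fmap, dy, dx) for dy, dx in DIRECTIONS]
print(len(heads), "heads, each with", len(heads[0]) ** 2, "channel-pair maps")
```

In this reading each head yields C x C product maps, so a real module would need some projection back to C channels before the gated residual; how the preprint does that is not stated in this record.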
