| Main Authors: | Li, Tianqin, Zhao, Junru, Jiang, Dunhan, Wu, Shenghao, Ramirez, Alan, Lee, Tai Sing |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2506.01201 |
Similar Items
Learning More by Seeing Less: Structure First Learning for Efficient, Transferable, and Human-Aligned Vision
by: Li, Tianqin, et al.
Published: (2025)
Does resistance to style-transfer equal Global Shape Bias? Measuring network sensitivity to global shape configuration
by: Wen, Ziqi, et al.
Published: (2023)
From Local Cues to Global Percepts: Emergent Gestalt Organization in Self-Supervised Vision Models
by: Li, Tianqin, et al.
Published: (2025)
In-Place Panoptic Radiance Field Segmentation with Perceptual Prior for 3D Scene Understanding
by: Li, Shenghao
Published: (2024)
Modeling Rapid Contextual Learning in the Visual Cortex with Fast-Weight Deep Autoencoder Networks
by: Li, Yue, et al.
Published: (2025)
Don't Judge Before You CLIP: A Unified Approach for Perceptual Tasks
by: Zalcher, Amit, et al.
Published: (2025)
Do We Always Need the Simplicity Bias? Looking for Optimal Inductive Biases in the Wild
by: Teney, Damien, et al.
Published: (2025)
Smart Feature is What You Need
by: Hu, Zhaoxin, et al.
Published: (2024)
Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts
by: Kao, Shiu-hong, et al.
Published: (2025)
Take Only What You Need: Rank Minimization as an Implicit Forgetting Regularizer in Continual Learning
by: Lu, Haodong, et al.
Published: (2024)
SeTformer is What You Need for Vision and Language
by: Shamsolmoali, Pourya, et al.
Published: (2024)
Think Before You Act: A Two-Stage Framework for Mitigating Gender Bias Towards Vision-Language Tasks
by: Zhang, Yunqi, et al.
Published: (2024)
Chameleon: Images Are What You Need For Multimodal Learning Robust To Missing Modalities
by: Liaqat, Muhammad Irzam, et al.
Published: (2024)
Think Bright, Diffuse Nice: Enhancing T2I-ICL via Inductive-Bias Hint Instruction and Query Contrastive Decoding
by: Ma, Zhiyong, et al.
Published: (2026)
Learning to See What You Need: Gaze Attention for Multimodal Large Language Models
by: Song, Junha, et al.
Published: (2026)
Masked Generative Transformer Is What You Need for Image Editing
by: Chow, Wei, et al.
Published: (2026)
Lite-SAM Is Actually What You Need for Segment Everything
by: Fu, Jianhai, et al.
Published: (2024)
Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing
by: Zhang, Boqiang, et al.
Published: (2024)
Look Before You Decide: Prompting Active Deduction of MLLMs for Assumptive Reasoning
by: Li, Yian, et al.
Published: (2024)
Perceptual Classifiers: Detecting Generative Images using Perceptual Features
by: Durbha, Krishna Srikar, et al.
Published: (2025)
What You Have is What You Track: Adaptive and Robust Multimodal Tracking
by: Tan, Yuedong, et al.
Published: (2025)
Multi-View Representation is What You Need for Point-Cloud Pre-Training
by: Yan, Siming, et al.
Published: (2023)
See It Before You Grab It: Deep Learning-based Action Anticipation in Basketball
by: Roy, Arnau Barrera, et al.
Published: (2025)
Low-Resolution Editing is All You Need for High-Resolution Editing
by: Lee, Junsung, et al.
Published: (2025)
A Strong Inductive Bias: Gzip for binary image classification
by: Scilipoti, Marco, et al.
Published: (2024)
Face-Voice Association with Inductive Bias for Maximum Class Separation
by: Moscati, Marta, et al.
Published: (2026)
ParameterNet: Parameters Are All You Need
by: Han, Kai, et al.
Published: (2023)
Forensic Self-Descriptions Are All You Need for Zero-Shot Detection, Open-Set Source Attribution, and Clustering of AI-generated Images
by: Nguyen, Tai D., et al.
Published: (2025)
Leveraging Geometric Visual Illusions as Perceptual Inductive Biases for Vision Models
by: Yang, Haobo, et al.
Published: (2025)
Beginning with You: Perceptual-Initialization Improves Vision-Language Representation and Alignment
by: Hu, Yang, et al.
Published: (2025)
Generating 360° Video is What You Need For a 3D Scene
by: Zhang, Zhaoyang, et al.
Published: (2025)
Taxes Are All You Need: Integration of Taxonomical Hierarchy Relationships into the Contrastive Loss
by: Kokilepersaud, Kiran, et al.
Published: (2024)
Label Critic: Design Data Before Models
by: Bassi, Pedro R. A. S., et al.
Published: (2024)
Emu3: Next-Token Prediction is All You Need
by: Wang, Xinlong, et al.
Published: (2024)
BiECVC: Gated Diversification of Bidirectional Contexts for Learned Video Compression
by: Jiang, Wei, et al.
Published: (2025)
NijiGAN: Transform What You See into Anime with Contrastive Semi-Supervised Learning and Neural Ordinary Differential Equations
by: Santoso, Kevin Putra, et al.
Published: (2024)
What Happens Before Decoding? Prefill Determines GUI Grounding in VLMs
by: Lin, Jiaping, et al.
Published: (2026)
LVC-LGMC: Joint Local and Global Motion Compensation for Learned Video Compression
by: Jiang, Wei, et al.
Published: (2024)
AgenticOCR: Parsing Only What You Need for Efficient Retrieval-Augmented Generation
by: Wang, Zhengren, et al.
Published: (2026)
Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement
by: Yang, Tao, et al.
Published: (2024)