| Main Authors: | Hu, Zizhao, Zhou, Xiaolin, Rostami, Mohammad |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2412.07049 |
Similar Items
Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion
by: Hu, Zizhao, et al.
Published: (2024)
An Intermediate Fusion ViT Enables Efficient Text-Image Alignment in Diffusion Models
by: Hu, Zizhao, et al.
Published: (2024)
Unsupervised Domain Adaptation Using Compact Internal Representations
by: Rostami, Mohammad
Published: (2024)
A New Class Biorthogonal Spline Wavelet for Image Edge Detection
by: Zhou, Dujuan, et al.
Published: (2024)
Continuous Unsupervised Domain Adaptation Using Stabilized Representations and Experience Replay
by: Rostami, Mohammad
Published: (2024)
Online Continual Domain Adaptation for Semantic Image Segmentation Using Internal Representations
by: Stan, Serban, et al.
Published: (2024)
Dynamic Transformer Architecture for Continual Learning of Multimodal Tasks
by: Cai, Yuliang, et al.
Published: (2024)
Cross-Domain Distribution Alignment for Segmentation of Private Unannotated 3D Medical Images
by: Sun, Ruitong, et al.
Published: (2024)
Relating Events and Frames Based on Self-Supervised Learning and Uncorrelated Conditioning for Unsupervised Domain Adaptation
by: Rostami, Mohammad, et al.
Published: (2024)
Cross-domain Multi-modal Few-shot Object Detection via Rich Text
by: Shangguan, Zeyu, et al.
Published: (2024)
CluMo: Cluster-based Modality Fusion Prompt for Continual Learning in Visual Question Answering
by: Cai, Yuliang, et al.
Published: (2024)
DynRsl-VLM: Enhancing Autonomous Driving Perception with Dynamic Resolution Vision-Language Models
by: Zhou, Xirui, et al.
Published: (2025)
Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention
by: Li, Kai, et al.
Published: (2025)
From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects
by: Li, Zizhao, et al.
Published: (2024)
Attention Is not Everything: Efficient Alternatives for Vision
by: Kazi, Nur Mohammad, et al.
Published: (2026)
TNG-CLIP:Training-Time Negation Data Generation for Negation Awareness of CLIP
by: Cai, Yuliang, et al.
Published: (2025)
Cross-domain Few-shot Object Detection with Multi-modal Textual Enrichment
by: Shangguan, Zeyu, et al.
Published: (2025)
SRMambaV2: Biomimetic Attention for Sparse Point Cloud Upsampling in Autonomous Driving
by: Chen, Chuang, et al.
Published: (2025)
HDBFormer: Efficient RGB-D Semantic Segmentation with A Heterogeneous Dual-Branch Framework
by: Wei, Shuobin, et al.
Published: (2025)
Receptive Field Expanded Look-Up Tables for Vision Inference: Advancing from Low-level to High-level Tasks
by: Zhang, Xi, et al.
Published: (2025)
HEART-VIT: Hessian-Guided Efficient Dynamic Attention and Token Pruning in Vision Transformer
by: Uddin, Mohammad Helal, et al.
Published: (2025)
Unsupervised Federated Domain Adaptation for Segmentation of MRI Images
by: Nananukul, Navapat, et al.
Published: (2024)
Semi-Supervised Masked Autoencoders: Unlocking Vision Transformer Potential with Limited Data
by: Faysal, Atik, et al.
Published: (2026)
A Vision-Centric Approach for Static Map Element Annotation
by: Zhang, Jiaxin, et al.
Published: (2023)
MiVE: Multiscale Vision-language features for reference-guided video Editing
by: Wang, Tong, et al.
Published: (2026)
Out-of-distribution detection in 3D applications: a review
by: Li, Zizhao, et al.
Published: (2025)
ScalableMap: Scalable Map Learning for Online Long-Range Vectorized HD Map Construction
by: Yu, Jingyi, et al.
Published: (2023)
SG-LDM: Semantic-Guided LiDAR Generation via Latent-Aligned Diffusion
by: Xiang, Zhengkang, et al.
Published: (2025)
Curvature Diversity-Driven Deformation and Domain Alignment for Point Cloud
by: Wu, Mengxi, et al.
Published: (2024)
FGNet: Leveraging Feature-Guided Attention to Refine SAM2 for 3D EM Neuron Segmentation
by: Li, Zhenghua, et al.
Published: (2025)
Unsupervised Monocular Road Segmentation for Autonomous Driving via Scene Geometry
by: Rostami, Sara Hatami, et al.
Published: (2025)
Representative Attention For Vision Transformers
by: Li, Yuntong, et al.
Published: (2026)
Vision Transformers with Hierarchical Attention
by: Liu, Yun, et al.
Published: (2021)
CAMAv2: A Vision-Centric Approach for Static Map Element Annotation
by: Chen, Shiyuan, et al.
Published: (2024)
Static for Dynamic: Towards a Deeper Understanding of Dynamic Facial Expressions Using Static Expression Data
by: Chen, Yin, et al.
Published: (2024)
Pay Attention to the Keys: Visual Piano Transcription Using Transformers
by: Zivanovic, Uros, et al.
Published: (2024)
Beyond Static Frames: Temporal Aggregate-and-Restore Vision Transformer for Human Pose Estimation
by: Fang, Hongwei, et al.
Published: (2026)
Learning Weakly Supervised Audio-Visual Violence Detection in Hyperbolic Space
by: Peng, Xiaogang, et al.
Published: (2023)
Structured Initialization for Attention in Vision Transformers
by: Zheng, Jianqiao, et al.
Published: (2024)
Vision Transformers are Circulant Attention Learners
by: Han, Dongchen, et al.
Published: (2025)