:: Library Catalog

Bibliographic Details
Main Authors:	Oskouie, Haniyeh Ehsani; Moin, Mohammad-Shahram; Kasaei, Shohreh
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Machine Learning; Multimedia
Online Access:	https://arxiv.org/abs/2404.13621

Similar Items

Interpretation of Neural Networks is Susceptible to Universal Adversarial Perturbations
by: Oskouie, Haniyeh Ehsani, et al.
Published: (2022)

MMLoP: Multi-Modal Low-Rank Prompting for Efficient Vision-Language Adaptation
by: Ghiasvand, Sajjad, et al.
Published: (2026)

Test-Time Defense Against Adversarial Attacks via Stochastic Resonance of Latent Ensembles
by: Lao, Dong, et al.
Published: (2025)

Learning from Mistakes: Self-Regularizing Hierarchical Representations in Point Cloud Semantic Segmentation
by: Camuffo, Elena, et al.
Published: (2023)

Understanding Key Point Cloud Features for Development Three-dimensional Adversarial Attacks
by: Naderi, Hanieh, et al.
Published: (2022)

Evaluating the Impact of Point Cloud Colorization on Semantic Segmentation Accuracy
by: Zhu, Qinfeng, et al.
Published: (2024)

BRep Boundary and Junction Detection for CAD Reverse Engineering
by: Ali, Sk Aziz, et al.
Published: (2024)

Spatial Visibility and Temporal Dynamics: Revolutionizing Field of View Prediction in Adaptive Point Cloud Video Streaming
by: Li, Chen, et al.
Published: (2024)

Detection of Cyberbullying in GIF using AI
by: Dave, Pal, et al.
Published: (2025)

Omnidirectional Video Super-Resolution using Deep Learning
by: Baniya, Arbind Agrahari, et al.
Published: (2025)

Parallel Backpropagation for Inverse of a Convolution with Application to Normalizing Flows
by: Nagar, Sandeep, et al.
Published: (2024)

Flow Generator Matching
by: Huang, Zemin, et al.
Published: (2024)

BadCM: Invisible Backdoor Attack Against Cross-Modal Learning
by: Zhang, Zheng, et al.
Published: (2024)

RAVEN: Query-Guided Representation Alignment for Question Answering over Audio, Video, Embedded Sensors, and Natural Language
by: Biswas, Subrata, et al.
Published: (2025)

TbExplain: A Text-based Explanation Method for Scene Classification Models with the Statistical Prediction Correction
by: Aminimehr, Amirhossein, et al.
Published: (2023)

NeRFlex: Resource-aware Real-time High-quality Rendering of Complex Scenes on Mobile Devices
by: Wang, Zhe, et al.
Published: (2025)

On-the-Fly Object-aware Representative Point Selection in Point Cloud
by: Zhang, Xiaoyu, et al.
Published: (2025)

Efficient and Generic Point Model for Lossless Point Cloud Attribute Compression
by: You, Kang, et al.
Published: (2024)

Efficient Bitrate Ladder Construction using Transfer Learning and Spatio-Temporal Features
by: Falahati, Ali, et al.
Published: (2024)

Neuron Abandoning Attention Flow: Visual Explanation of Dynamics inside CNN Models
by: Liao, Yi, et al.
Published: (2024)

Cross-Modal Coordination Across a Diverse Set of Input Modalities
by: Sánchez, Jorge, et al.
Published: (2024)

Regularized Contrastive Partial Multi-view Outlier Detection
by: Wang, Yijia, et al.
Published: (2024)

Improving Accuracy and Generalization for Efficient Visual Tracking
by: Zaveri, Ram, et al.
Published: (2024)

Generalized Jersey Number Recognition Using Multi-task Learning With Orientation-guided Weight Refinement
by: Lin, Yung-Hui, et al.
Published: (2024)

Relating CNN-Transformer Fusion Network for Change Detection
by: Gao, Yuhao, et al.
Published: (2024)

LinVT: Empower Your Image-level Large Language Model to Understand Videos
by: Gao, Lishuai, et al.
Published: (2024)

Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines
by: Zhang, Honglei, et al.
Published: (2024)

MTCAE-DFER: Multi-Task Cascaded Autoencoder for Dynamic Facial Expression Recognition
by: Xiang, Peihao, et al.
Published: (2024)

Bridging Compressed Image Latents and Multimodal Large Language Models
by: Kao, Chia-Hao, et al.
Published: (2024)

Adversarially Robust Deepfake Detection via Adversarial Feature Similarity Learning
by: Khan, Sarwar
Published: (2024)

X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs
by: Swetha, Sirnam, et al.
Published: (2024)

CinePile: A Long Video Question Answering Dataset and Benchmark
by: Rawal, Ruchit, et al.
Published: (2024)

360VFI: A Dataset and Benchmark for Omnidirectional Video Frame Interpolation
by: Lu, Wenxuan, et al.
Published: (2024)

Multimodal Transformer With a Low-Computational-Cost Guarantee
by: Park, Sungjin, et al.
Published: (2024)

Deep Learning-based Text-in-Image Watermarking
by: Karki, Bishwa, et al.
Published: (2024)

MIP-GAF: A MLLM-annotated Benchmark for Most Important Person Localization and Group Context Understanding
by: Madan, Surbhi, et al.
Published: (2024)

Improving Long-Text Alignment for Text-to-Image Diffusion Models
by: Liu, Luping, et al.
Published: (2024)

Storybooth: Training-free Multi-Subject Consistency for Improved Visual Storytelling
by: Singh, Jaskirat, et al.
Published: (2025)

Zero-shot image privacy classification with Vision-Language Models
by: Baia, Alina Elena, et al.
Published: (2025)

PMPGuard: Catching Pseudo-Matched Pairs in Remote Sensing Image-Text Retrieval
by: Ouyang, Pengxiang, et al.
Published: (2025)