| Main Authors: | Wang, Songping, Liu, Hanqing, Lyu, Yueming, Hu, Xiantao, He, Ziwen, Wang, Wei, Shan, Caifeng, Wang, Liang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.14921 |
Similar Items
An Effective End-to-End Solution for Multimodal Action Recognition
by: Wang, Songping, et al.
Published: (2025)
Exploring Adversarial Transferability between Kolmogorov-Arnold Networks
by: Wang, Songping, et al.
Published: (2025)
Exposing and Defending the Achilles' Heel of Video Mixture-of-Experts
by: Wang, Songping, et al.
Published: (2026)
RunawayEvil: Jailbreaking the Image-to-Video Generative Models
by: Wang, Songping, et al.
Published: (2025)
Anti-Aesthetics: Protecting Facial Privacy against Customized Text-to-Image Synthesis
by: Wang, Songping, et al.
Published: (2025)
GOOD: Training-Free Guided Diffusion Sampling for Out-of-Distribution Detection
by: Gao, Xin, et al.
Published: (2025)
Adversarially Masked Video Consistency for Unsupervised Domain Adaptation
by: Zhu, Xiaoyu, et al.
Published: (2024)
One-to-More: High-Fidelity Training-Free Anomaly Generation with Attention Control
by: Rao, Haoxiang, et al.
Published: (2026)
Low-Light Video Enhancement via Spatial-Temporal Consistent Decomposition
by: Xu, Xiaogang, et al.
Published: (2024)
Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models
by: Liang, Hanwen, et al.
Published: (2024)
Evaluating Adversarial Robustness in the Spatial Frequency Domain
by: Liao, Keng-Hsin, et al.
Published: (2024)
VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding
by: Yang, Ruoliu, et al.
Published: (2026)
Efficient Video Face Enhancement with Enhanced Spatial-Temporal Consistency
by: Wang, Yutong, et al.
Published: (2024)
DeltaSpace: A Semantic-aligned Feature Space for Flexible Text-guided Image Editing
by: Lyu, Yueming, et al.
Published: (2023)
ABounD: Adversarial Boundary-Driven Few-Shot Learning for Multi-Class Anomaly Detection
by: Deng, Runzhi, et al.
Published: (2025)
Uncertainty-Aware Concept and Motion Segmentation for Semi-Supervised Angiography Videos
by: Luo, Yu, et al.
Published: (2026)
Optimize-at-Capture: Highly-adaptive Exposure Controlling for In-Vehicle Non-contact Heart-rate Monitoring
by: Wang, Jieying, et al.
Published: (2026)
PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design
by: Wei, Jiazhe, et al.
Published: (2025)
Consistent Human Image and Video Generation with Spatially Conditioned Diffusion
by: Cao, Mingdeng, et al.
Published: (2024)
NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors
by: Bin, Yanrui, et al.
Published: (2025)
Salt: Self-Consistent Distribution Matching with Cache-Aware Training for Fast Video Generation
by: Ge, Xingtong, et al.
Published: (2026)
Leveraging Information Consistency in Frequency and Spatial Domain for Adversarial Attacks
by: Jin, Zhibo, et al.
Published: (2024)
Spatial-Frequency Discriminability for Revealing Adversarial Perturbations
by: Wang, Chao, et al.
Published: (2023)
Counterfactual Explanations for Face Forgery Detection via Adversarial Removal of Artifacts
by: Li, Yang, et al.
Published: (2024)
FastInit: Fast Noise Initialization for Temporally Consistent Video Generation
by: Bai, Chengyu, et al.
Published: (2025)
Spatial-Temporal Decoupled Reference Conditioning for Identity-Preserving Text-to-Video Generation
by: Chen, Yuheng, et al.
Published: (2026)
Frequency-Guided Diffusion Model with Perturbation Training for Skeleton-Based Video Anomaly Detection
by: Tan, Xiaofeng, et al.
Published: (2024)
Self-Supervised Representation Learning with Spatial-Temporal Consistency for Sign Language Recognition
by: Zhao, Weichao, et al.
Published: (2024)
Self-Attentive Spatio-Temporal Calibration for Precise Intermediate Layer Matching in ANN-to-SNN Distillation
by: Hong, Di, et al.
Published: (2025)
VideoCompressa: Data-Efficient Video Understanding via Joint Temporal Compression and Spatial Reconstruction
by: Wang, Shaobo, et al.
Published: (2025)
Mixture of Weak & Strong Experts on Graphs
by: Zeng, Hanqing, et al.
Published: (2023)
InstaVSR: Taming Diffusion for Efficient and Temporally Consistent Video Super-Resolution
by: Hu, Jintong, et al.
Published: (2026)
Robust Alignment: Harmonizing Clean Accuracy and Adversarial Robustness in Adversarial Training
by: Wang, Yanyun, et al.
Published: (2026)
VideoLifter: Lifting Videos to 3D with Fast Hierarchical Stereo Alignment
by: Cong, Wenyan, et al.
Published: (2025)
Temporal-Consistent Video Restoration with Pre-trained Diffusion Models
by: Wang, Hengkang, et al.
Published: (2025)
VideoLoom: A Video Large Language Model for Joint Spatial-Temporal Understanding
by: Shi, Jiapeng, et al.
Published: (2026)
DirectSwap: Mask-Free Cross-Identity Training and Benchmarking for Expression-Consistent Video Head Swapping
by: Wang, Yanan, et al.
Published: (2025)
Weak to Strong: VLM-Based Pseudo-Labeling as a Weakly Supervised Training Strategy in Multimodal Video-based Hidden Emotion Understanding Tasks
by: Wang, Yufei, et al.
Published: (2026)
Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking
by: Hu, Xiantao, et al.
Published: (2024)
Dual Frequency Branch Framework with Reconstructed Sliding Windows Attention for AI-Generated Image Detection
by: Yan, Jiazhen, et al.
Published: (2025)