| Main Authors: | Li, Yiyue, Zhang, Shaoting, Li, Kang, Lao, Qicheng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.01201 |
Similar Items
DiffYOLO: Object Detection for Anti-Noise via YOLO and Diffusion Models
by: Liu, Yichen, et al.
Published: (2024)
Unsupervised Anomaly Detection Using Diffusion Trend Analysis for Display Inspection
by: Kim, Eunwoo, et al.
Published: (2024)
Synthetic Industrial Object Detection: GenAI vs. Feature-Based Methods
by: Araya-Martinez, Jose Moises, et al.
Published: (2025)
UrbanAlign: Post-hoc Semantic Calibration for VLM-Human Preference Alignment
by: Zhang, Yecheng, et al.
Published: (2026)
Revisiting Energy-Based Model for Out-of-Distribution Detection
by: Wu, Yifan, et al.
Published: (2024)
SurgVLM: A Large Vision-Language Model and Systematic Evaluation Benchmark for Surgical Intelligence
by: Zeng, Zhitao, et al.
Published: (2025)
Semantic Prioritization in Visual Counterfactual Explanations with Weighted Segmentation and Auto-Adaptive Region Selection
by: Zhang, Lintong, et al.
Published: (2025)
Learning Association via Track-Detection Matching for Multi-Object Tracking
by: Adžemović, Momir
Published: (2025)
ClustViT: Clustering-based Token Merging for Semantic Segmentation
by: Montello, Fabio, et al.
Published: (2025)
Zero-Shot Multi-Criteria Visual Quality Inspection for Semi-Controlled Industrial Environments via Real-Time 3D Digital Twin Simulation
by: Araya-Martinez, Jose Moises, et al.
Published: (2025)
Multi-scale Temporal Prediction via Incremental Generation and Multi-agent Collaboration
by: Zeng, Zhitao, et al.
Published: (2025)
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think
by: Tian, Jie, et al.
Published: (2025)
Deep Learning Approaches for Human Action Recognition in Video Data
by: Xie, Yufei
Published: (2024)
LatentForensics: Towards frugal deepfake detection in the StyleGAN latent space
by: Delmas, Matthieu, et al.
Published: (2023)
PlaneSAM: Multimodal Plane Instance Segmentation Using the Segment Anything Model
by: Deng, Zhongchen, et al.
Published: (2024)
SynthRender and IRIS: Open-Source Framework and Dataset for Bidirectional Sim-Real Transfer in Industrial Object Perception
by: Araya-Martinez, Jose Moises, et al.
Published: (2026)
Video-CoE: Reinforcing Video Event Prediction via Chain of Events
by: Su, Qile, et al.
Published: (2026)
VA-$π$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation
by: Liao, Xinyao, et al.
Published: (2025)
Parking Space Detection in the City of Granada
by: Luis, Crespo-Orti, et al.
Published: (2025)
DOD-SA: Infrared-Visible Decoupled Object Detection with Single-Modality Annotations
by: Jin, Hang, et al.
Published: (2025)
Learning Discriminative Spatio-temporal Representations for Semi-supervised Action Recognition
by: Wang, Yu, et al.
Published: (2024)
Smelly, dense, and spreaded: The Object Detection for Olfactory References (ODOR) dataset
by: Zinnen, Mathias, et al.
Published: (2025)
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation
by: Allakhverdov, Eduard, et al.
Published: (2025)
Image Reconstruction as a Tool for Feature Analysis
by: Allakhverdov, Eduard, et al.
Published: (2025)
Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models
by: Li, Jinhao, et al.
Published: (2024)
ARTPS: Depth-Enhanced Hybrid Anomaly Detection and Learnable Curiosity Score for Autonomous Rover Target Prioritization
by: Baydemir, Poyraz
Published: (2025)
TRACES: Temporal Recall with Contextual Embeddings for Real-Time Video Anomaly Detection
by: Siddiqui, Yousuf Ahmed, et al.
Published: (2025)
Hierarchical Point-Patch Fusion with Adaptive Patch Codebook for 3D Shape Anomaly Detection
by: Kang, Xueyang, et al.
Published: (2026)
Sequence Matters: Harnessing Video Models in 3D Super-Resolution
by: Ko, Hyun-kyu, et al.
Published: (2024)
Attend, Distill, Detect: Attention-aware Entropy Distillation for Anomaly Detection
by: Jena, Sushovan, et al.
Published: (2024)
A Hierarchically Feature Reconstructed Autoencoder for Unsupervised Anomaly Detection
by: Chen, Honghui, et al.
Published: (2024)
From Gaze to Insight: Bridging Human Visual Attention and Vision Language Model Explanation for Weakly-Supervised Medical Image Segmentation
by: Chen, Jingkun, et al.
Published: (2025)
On the Inherent Robustness of One-Stage Object Detection against Out-of-Distribution Data
by: Martinez-Seras, Aitor, et al.
Published: (2024)
DSER: Spectral Epipolar Representation for Efficient Light Field Depth Estimation
by: Mohammad, Noor Islam S., et al.
Published: (2025)
Hierarchical Spatial Algorithms for High-Resolution Image Quantization and Feature Extraction
by: Mohammad, Noor Islam S.
Published: (2025)
VLM-NCD: Novel Class Discovery with Vision-Based Large Language Models
by: Su, Yuetong, et al.
Published: (2025)
A Survey on Dynamic Neural Networks: from Computer Vision to Multi-modal Sensor Fusion
by: Montello, Fabio, et al.
Published: (2025)
VDPP: Video Depth Post-Processing for Speed and Scalability
by: Yoon, Daewon, et al.
Published: (2026)
Do Generative Metrics Predict YOLO Performance? An Evaluation Across Models, Augmentation Ratios, and Dataset Complexity
by: Marian, Vasile, et al.
Published: (2026)
Quantifying and Narrowing the Unknown: Interactive Text-to-Video Retrieval via Uncertainty Minimization
by: Zhang, Bingqing, et al.
Published: (2025)