| Main Authors: | Yu, Yating, Cao, Congqi, Wang, Zhaoying, Meng, Weihua, Li, Jie, Li, Yuxin, Wei, Zihao, Shen, Zhongpei, Zhang, Jiajun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2511.00613 |
Similar Items
Learning to Generalize without Bias for Open-Vocabulary Action Recognition
by: Yu, Yating, et al.
Published: (2025)
Vision and Intention Boost Large Language Model in Long-Term Action Anticipation
by: Cao, Congqi, et al.
Published: (2025)
Prototypical Learning Guided Context-Aware Segmentation Network for Few-Shot Anomaly Detection
by: Jiang, Yuxin, et al.
Published: (2025)
SRVAU-R1: Enhancing Video Anomaly Understanding via Reflection-Aware Learning
by: Zhao, Zihao, et al.
Published: (2026)
Autoregressive Denoising Score Matching is a Good Video Anomaly Detector
by: Zhang, Hanwen, et al.
Published: (2025)
SAW-Bench: Learning Situated Awareness in the Real World
by: Li, Chuhan, et al.
Published: (2026)
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
by: Li, Yifei, et al.
Published: (2025)
Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP
by: Yu, Yating, et al.
Published: (2024)
Task-Adapter: Task-specific Adaptation of Image Models for Few-shot Action Recognition
by: Cao, Congqi, et al.
Published: (2024)
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?
by: Wen, Zimo, et al.
Published: (2026)
Multi-View Reconstruction with Global Context for 3D Anomaly Detection
by: Sun, Yihan, et al.
Published: (2025)
Advancing Adaptive Multi-Stage Video Anomaly Reasoning: A Benchmark Dataset and Method
by: Huang, Chao, et al.
Published: (2026)
InkStream: Real-time GNN Inference on Streaming Graphs via Incremental Update
by: Wu, Dan, et al.
Published: (2023)
Task-Adapter++: Task-specific Adaptation with Order-aware Alignment for Few-shot Action Recognition
by: Cao, Congqi, et al.
Published: (2025)
ESOM: Efficiently Understanding Streaming Video Anomalies with Open-world Dynamic Definitions
by: Liu, Zihao, et al.
Published: (2026)
AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts
by: Li, Keyu, et al.
Published: (2026)
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding
by: Lin, Junming, et al.
Published: (2024)
No Need For Real Anomaly: MLLM Empowered Zero-Shot Video Anomaly Detection
by: Dai, Zunkai, et al.
Published: (2026)
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development
by: Yang, Jie, et al.
Published: (2026)
AD-Bench: A Real-World, Trajectory-Aware Advertising Analytics Benchmark for LLM Agents
by: Hu, Lingxiang, et al.
Published: (2026)
DentalBench: Benchmarking and Advancing LLMs Capability for Bilingual Dentistry Understanding
by: Zhu, Hengchuan, et al.
Published: (2025)
Towards Real-World HDR Video Reconstruction: A Large-Scale Benchmark Dataset and A Two-Stage Alignment Network
by: Shu, Yong, et al.
Published: (2024)
Towards Aerial Collaborative Stereo: Real-Time Cross-Camera Feature Association and Relative Pose Estimation for UAVs
by: Wang, Zhaoying, et al.
Published: (2024)
VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning
by: Zhu, Liyun, et al.
Published: (2025)
FinMCP-Bench: Benchmarking LLM Agents for Real-World Financial Tool Use under the Model Context Protocol
by: Zhu, Jie, et al.
Published: (2026)
ConsintBench: Evaluating Language Models on Real-World Consumer Intent Understanding
by: Li, Xiaozhe, et al.
Published: (2025)
Retina‐Like Neuromorphic Visual Sensor for Sensing Broad‐Spectrum Ultraviolet Light (Advanced Optical Materials 35/2024)
by: Zhaoying Xi, et al.
Published: (2024)
CauCLIP: Bridging the Sim-to-Real Gap in Surgical Video Understanding via Causality-Inspired Vision-Language Modeling
by: He, Yuxin, et al.
Published: (2026)
Hawk: Learning to Understand Open-World Video Anomalies
by: Tang, Jiaqi, et al.
Published: (2024)
RiskCueBench: Benchmarking Anticipatory Reasoning from Early Risk Cues in Video-Language Models
by: Luo, Sha, et al.
Published: (2026)
VEU-Bench: Towards Comprehensive Understanding of Video Editing
by: Li, Bozheng, et al.
Published: (2025)
ComBench: A Repo-level Real-world Benchmark for Compilation Error Repair
by: Li, Jia, et al.
Published: (2026)
Can LLMs Understand Time Series Anomalies?
by: Zhou, Zihao, et al.
Published: (2024)
Can Unified Generation and Understanding Models Maintain Semantic Equivalence Across Different Output Modalities?
by: Jiang, Hongbo, et al.
Published: (2026)
Context-Aware Probabilistic Modeling with LLM for Multimodal Time Series Forecasting
by: Yao, Yueyang, et al.
Published: (2025)
Rethinking Metrics and Benchmarks of Video Anomaly Detection
by: Liu, Zihao, et al.
Published: (2025)
WorldModelBench: Judging Video Generation Models As World Models
by: Li, Dacheng, et al.
Published: (2025)
ProAgentBench: Evaluating LLM Agents for Proactive Assistance with Real-World Data
by: Tang, Yuanbo, et al.
Published: (2026)
RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video
by: Xun, Shuhang, et al.
Published: (2025)
DecompileBench: A Comprehensive Benchmark for Evaluating Decompilers in Real-World Scenarios
by: Gao, Zeyu, et al.
Published: (2025)