| Main authors: | Liu, Shaoyu; Li, Jianing; Zhao, Guanghui; Zhang, Yunjian; Jiang, Wen; Li, Ming; Ji, Xiangyang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2602.03230 |
Similar items
EventBench: Towards Comprehensive Benchmarking of Event-based MLLMs
by: Liu, Shaoyu, et al.
Published: (2025)
EventGPT: Event Stream Understanding with Multimodal Large Language Models
by: Liu, Shaoyu, et al.
Published: (2024)
Towards Understanding How Knowledge Evolves in Large Vision-Language Models
by: Wang, Sudong, et al.
Published: (2025)
FlashCap: Millisecond-Accurate Human Motion Capture via Flashing LEDs and Event-Based Vision
by: Wu, Zekai, et al.
Published: (2026)
Learning to Remove Lens Flare in Event Camera
by: Han, Haiqian, et al.
Published: (2025)
LongFly: Long-Horizon UAV Vision-and-Language Navigation with Spatiotemporal Context Integration
by: Jiang, Wen, et al.
Published: (2025)
Memory Helps, but Confabulation Misleads: Understanding Streaming Events in Videos with MLLMs
by: Zhang, Gengyuan, et al.
Published: (2025)
EventZoom: A Progressive Approach to Event-Based Data Augmentation for Enhanced Neuromorphic Vision
by: Dong, Yiting, et al.
Published: (2024)
Moment-Video: Diagnosing Temporal Fidelity of Video MLLMs on Momentary Visual Events
by: Liu, Xiaolin, et al.
Published: (2026)
Towards Event-oriented Long Video Understanding
by: Du, Yifan, et al.
Published: (2024)
SpatialFly: Geometry-Guided Representation Alignment for UAV Vision-and-Language Navigation in Urban Environments
by: Jiang, Wen, et al.
Published: (2026)
ExtraVAR: Stage-Aware RoPE Remapping for Resolution Extrapolation in Visual Autoregressive Models
by: Yan, Feihong, et al.
Published: (2026)
Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios
by: Li, Chunxiao, et al.
Published: (2025)
Adaptive Event Stream Slicing for Open-Vocabulary Event-Based Object Detection via Vision-Language Knowledge Distillation
by: Zhang, Jinchang, et al.
Published: (2025)
HDI-Former: Hybrid Dynamic Interaction ANN-SNN Transformer for Object Detection Using Frames and Events
by: Li, Dianze, et al.
Published: (2024)
Zebrafish Counting Using Event Stream Data
by: Chen, Qianghua, et al.
Published: (2025)
Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs
by: Su, Yongyi, et al.
Published: (2025)
Event-Priori-Based Vision-Language Model for Efficient Visual Understanding
by: Qin, Haotong, et al.
Published: (2025)
RS-OOD: A Vision-Language Augmented Framework for Out-of-Distribution Detection in Remote Sensing
by: Wang, Chenhao, et al.
Published: (2025)
CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications
by: Zhang, Tianfang, et al.
Published: (2024)
Efficient Event-Based Semantic Segmentation via Exploiting Frame-Event Fusion: A Hybrid Neural Network Approach
by: Li, Hebei, et al.
Published: (2025)
VLDrive: Vision-Augmented Lightweight MLLMs for Efficient Language-grounded Autonomous Driving
by: Zhang, Ruifei, et al.
Published: (2025)
EventGait: Towards Robust Gait Recognition with Event Streams
by: Xu, Senyan, et al.
Published: (2026)
EventHallusion: Diagnosing Event Hallucinations in Video LLMs
by: Zhang, Jiacheng, et al.
Published: (2024)
Event Transformer
by: Jiang, Bin, et al.
Published: (2022)
Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions
by: Li, Yunheng, et al.
Published: (2026)
Magic-MM-Embedding: Towards Visual-Token-Efficient Universal Multimodal Embedding with MLLMs
by: Li, Qi, et al.
Published: (2026)
Research, Applications and Prospects of Event-Based Pedestrian Detection: A Survey
by: Wang, Han, et al.
Published: (2024)
Self-supervised Event-based Monocular Depth Estimation using Cross-modal Consistency
by: Zhu, Junyu, et al.
Published: (2024)
FreqTrack: Frequency Learning based Vision Transformer for RGB-Event Object Tracking
by: You, Jinlin, et al.
Published: (2026)
Temporal-Guided Visual Foundation Models for Event-Based Vision
by: Xia, Ruihao, et al.
Published: (2025)
Towards Camera-Robust 3D Localization: Equation-Anchored Tool-Use for MLLMs
by: Jiang, Xueying, et al.
Published: (2026)
Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation
by: Wang, Zihan, et al.
Published: (2024)
evMLP: An Efficient Event-Driven MLP Architecture for Vision
by: Zheng, Zhentan
Published: (2025)
Learning from Dense Events: Towards Fast Spiking Neural Networks Training via Event Dataset Distillation
by: Ye, Shuhan, et al.
Published: (2025)
Making MLLMs Blind: Adversarial Smuggling Attacks in MLLM Content Moderation
by: Li, Zhiheng, et al.
Published: (2026)
From Events to Enhancement: A Survey on Event-Based Imaging Technologies
by: Lu, Yunfan, et al.
Published: (2025)
AnyRefill: A Unified, Data-Efficient Framework for Left-Prompt-Guided Vision Tasks
by: Xie, Ming, et al.
Published: (2025)
EventSTU: Event-Guided Efficient Spatio-Temporal Understanding for Video Large Language Models
by: Xu, Wenhao, et al.
Published: (2025)
V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation
by: Lou, Hanyue, et al.
Published: (2025)