| Main Authors: | Zhang, Jinrong, Wang, Penghui, Liu, Chunxiao, Liu, Wei, Jin, Dian, Zhang, Qiong, Meng, Erli, Hu, Zhengnan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2412.10719 |
Similar Items
Empowering VLMs for Few-Shot Multimodal Time Series Classification via Tailored Agentic Reasoning
by: Li, Lin, et al.
Published: (2026)
DeltaVLM: Interactive Remote Sensing Image Change Analysis via Instruction-guided Difference Perception
by: Deng, Pei, et al.
Published: (2025)
Human Activity Recognition in an Open World
by: Prijatelj, Derek S., et al.
Published: (2022)
AVadCLIP: Audio-Visual Collaboration for Robust Video Anomaly Detection
by: Wu, Peng, et al.
Published: (2025)
RCooper: A Real-world Large-scale Dataset for Roadside Cooperative Perception
by: Hao, Ruiyang, et al.
Published: (2024)
RPCASSM: Robust PCA State Space Model For Infrared Small Target Detection
by: Liu, Pingping, et al.
Published: (2026)
LRCP: Low-Rank Compressibility Guided Visual Token Pruning for Efficient LVLMs
by: Lu, Hongyu, et al.
Published: (2026)
FAME: Feature Activation Map Explanation on Image Classification and Face Recognition
by: Zhang, Xinyi, et al.
Published: (2026)
Robust Multi-Source Covid-19 Detection in CT Images
by: Pritha, Asmita Yuki, et al.
Published: (2026)
See What You Need: Query-Aware Visual Intelligence through Reasoning-Perception Loops
by: Dong, Zixuan, et al.
Published: (2025)
PrePrompt: Predictive prompting for class incremental learning
by: Huang, Libo, et al.
Published: (2025)
FS-DAG: Few Shot Domain Adapting Graph Networks for Visually Rich Document Understanding
by: Agarwal, Amit, et al.
Published: (2025)
LongSumEval: Question-Answering Based Evaluation and Feedback-Driven Refinement for Long Document Summarization
by: Nguyen, Huyen, et al.
Published: (2026)
Motion-Guided Semantic Alignment with Negative Prompts for Zero-Shot Video Action Recognition
by: Wang, Yiming, et al.
Published: (2026)
PI-TTA: Physics-Informed Source-Free Test-Time Adaptation for Robust Human Activity Recognition on Mobile Devices
by: Li, Changyu, et al.
Published: (2026)
DragOSM: Extract Building Roofs and Footprints from Aerial Images by Aligning Historical Labels
by: Li, Kai, et al.
Published: (2025)
Prompt Sensitivity in Vision-Language Grounding: How Small Changes in Wording Affect Object Detection
by: Deka, Dawar Jyoti, et al.
Published: (2026)
SoundPlot: An Open-Source Framework for Birdsong Acoustic Analysis and Neural Synthesis with Interactive 3D Visualization
by: Mehdi, Naqcho Ali, et al.
Published: (2026)
Performance Evaluation of Transfer Learning Based Medical Image Classification Techniques for Disease Detection
by: Ahmad, Zeeshan, et al.
Published: (2025)
CaLoRAify: Calorie Estimation with Visual-Text Pairing and LoRA-Driven Visual Language Models
by: Yao, Dongyu, et al.
Published: (2024)
Libra: Leveraging Temporal Images for Biomedical Radiology Analysis
by: Zhang, Xi, et al.
Published: (2024)
Synthetic Image Detection with CLIP: Understanding and Assessing Predictive Cues
by: Willi, Marco, et al.
Published: (2026)
Learning from Semantic Dictionaries: Discriminative Codebook Contrastive Learning for Unified Visual Representation and Generation
by: Estepa, Imanol G., et al.
Published: (2026)
Anomaly Detection of Particle Orbit in Accelerator using LSTM Deep Learning Technology
by: Chen, Zhiyuan, et al.
Published: (2024)
PipeMFL-240K: A Large-scale Dataset and Benchmark for Object Detection in Pipeline Magnetic Flux Leakage Imaging
by: Qu, Tianyi, et al.
Published: (2026)
Prototype Contrastive Consistency Learning for Semi-Supervised Medical Image Segmentation
by: He, Shihuan, et al.
Published: (2025)
Mobile-Ready Automated Triage of Diabetic Retinopathy Using Digital Fundus Images
by: Joshi, Aadi, et al.
Published: (2026)
LLM-Guided Exemplar Selection for Few-Shot Wearable-Sensor Human Activity Recognition
by: Ronando, Elsen, et al.
Published: (2025)
SAM Fewshot Finetuning for Anatomical Segmentation in Medical Images
by: Xie, Weiyi, et al.
Published: (2024)
OpenFake: An Open Dataset and Platform Toward Real-World Deepfake Detection
by: Livernoche, Victor, et al.
Published: (2025)
Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic Approach
by: Ke, Yueying
Published: (2025)
OrganicHAR: Towards Activity Discovery in Organic Settings for Privacy Preserving Sensors Using Efficient Video Analysis
by: Patidar, Prasoon, et al.
Published: (2026)
Image-Based Leopard Seal Recognition: Approaches and Challenges in Current Automated Systems
by: Salazar, Jorge Yero, et al.
Published: (2024)
On Onboard LiDAR-based Flying Object Detection
by: Vrba, Matouš, et al.
Published: (2023)
Proximity QA: Unleashing the Power of Multi-Modal Large Language Models for Spatial Proximity Analysis
by: Li, Jianing, et al.
Published: (2024)
Leveraging GNSS and Onboard Visual Data from Consumer Vehicles for Robust Road Network Estimation
by: Opra, Balázs, et al.
Published: (2024)
Advancing Visual Computing in Materials Science (Shonan Seminar 189)
by: Heinzl, Christoph, et al.
Published: (2024)
In-Hospital Stroke Prediction from PPG-Derived Hemodynamic Features
by: Liu, Jiaming, et al.
Published: (2026)
Synthesizing EEG Signals from Event-Related Potential Paradigms with Conditional Diffusion Models
by: Klein, Guido, et al.
Published: (2024)
Modular Deep Learning Framework for Assistive Perception: Gaze, Affect, and Speaker Identification
by: Anchan, Akshit Pramod, et al.
Published: (2025)