| Main Authors: | Mishra, Suyash, Li, Qiang, Patil, Srikanth, Pati, Satyanarayan, Narendra, Baddu |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.04891 |
Similar Items
Finder: A Multimodal AI-Powered Search Framework for Pharmaceutical Data Retrieval
by: Mishra, Suyash, et al.
Published: (2026)
From Understanding to Engagement: Personalized pharmacy Video Clips via Vision Language Models (VLMs)
by: Mishra, Suyash, et al.
Published: (2026)
GenAI Arena: An Open Evaluation Platform for Generative Models
by: Jiang, Dongfu, et al.
Published: (2024)
Face Consistency Benchmark for GenAI Video
by: Podstawski, Michal, et al.
Published: (2025)
Can Large Vision-Language Models Detect Images Copyright Infringement from GenAI?
by: Xu, Qipan, et al.
Published: (2025)
Synthetic Industrial Object Detection: GenAI vs. Feature-Based Methods
by: Araya-Martinez, Jose Moises, et al.
Published: (2025)
Video Active Perception: Effective Inference-Time Long-Form Video Understanding with Vision-Language Models
by: Ma, Martin Q., et al.
Published: (2026)
Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding
by: Shu, Yan, et al.
Published: (2024)
Enhancing weed detection performance by means of GenAI-based image augmentation
by: Modak, Sourav, et al.
Published: (2024)
Exploring Model Quantization in GenAI-based Image Inpainting and Detection of Arable Plants
by: Modak, Sourav, et al.
Published: (2025)
Language-driven Description Generation and Common Sense Reasoning for Video Action Recognition
by: Hu, Xiaodan, et al.
Published: (2025)
GenAI-DrawIO-Creator: A Framework for Automated Diagram Generation
by: Yu, Jinze, et al.
Published: (2026)
City Navigation in the Wild: Exploring Emergent Navigation from Web-Scale Knowledge in MLLMs
by: Dalal, Dwip, et al.
Published: (2025)
DiffAtlas: GenAI-fying Atlas Segmentation via Image-Mask Diffusion
by: Zhang, Hantao, et al.
Published: (2025)
GenAI Mirage: The Impostor Bias and the Deepfake Detection Challenge in the Era of Artificial Illusions
by: Casu, Mirko, et al.
Published: (2023)
Towards Temporal Compositional Reasoning in Long-Form Sports Videos
by: Cao, Siyu, et al.
Published: (2026)
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark
by: Chen, Haoxing, et al.
Published: (2024)
GenAI Confessions: Black-box Membership Inference for Generative Image Models
by: Bohacek, Matyas, et al.
Published: (2025)
AI-Generated Lecture Slides for Improving Slide Element Detection and Retrieval
by: Maniyar, Suyash, et al.
Published: (2025)
LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization
by: Huang, Zhenpeng, et al.
Published: (2026)
GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation
by: Li, Baiqi, et al.
Published: (2024)
CollabVR: Collaborative Video Reasoning with Vision-Language and Video Generation Models
by: Kim, Joowon, et al.
Published: (2026)
VideoARM: Agentic Reasoning over Hierarchical Memory for Long-Form Video Understanding
by: Yin, Yufei, et al.
Published: (2025)
Dual-Signal Adaptive KV-Cache Optimization for Long-Form Video Understanding in Vision-Language Models
by: Sai, Vishnu, et al.
Published: (2026)
Drifting Away from Truth: GenAI-Driven News Diversity Challenges LVLM-Based Misinformation Detection
by: Li, Fanxiao, et al.
Published: (2025)
Dimension vs. Precision: A Comparative Analysis of Autoencoders and Quantization for Efficient Vector Retrieval on BEIR SciFact
by: Pati, Satyanarayan
Published: (2025)
Self-Consistent Latent Reasoning: Long Latent Sequence Reasoning for Vision-Language Model
by: Wang, Chenfeng, et al.
Published: (2026)
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
by: Chen, Yukang, et al.
Published: (2024)
Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method
by: Song, Xinshuai, et al.
Published: (2024)
Not All Similarities Are Created Equal: Leveraging Data-Driven Biases to Inform GenAI Copyright Disputes
by: Hacohen, Uri, et al.
Published: (2024)
GAC-KAN: An Ultra-Lightweight GNSS Interference Classifier for GenAI-Powered Consumer Edge Devices
by: Zeng, Zhihan, et al.
Published: (2026)
Mitigating GenAI-powered Evidence Pollution for Out-of-Context Multimodal Misinformation Detection
by: Yan, Zehong, et al.
Published: (2025)
Infusing Environmental Captions for Long-Form Video Language Grounding
by: Lee, Hyogun, et al.
Published: (2024)
MPCAR: Multi-Perspective Contextual Augmentation for Enhanced Visual Reasoning in Large Vision-Language Models
by: Rahman, Amirul, et al.
Published: (2025)
NeuS-QA: Grounding Long-Form Video Understanding in Temporal Logic and Neuro-Symbolic Reasoning
by: Shah, Sahil, et al.
Published: (2025)
REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding
by: Li, Jiaze, et al.
Published: (2025)
Goldfish: Vision-Language Understanding of Arbitrarily Long Videos
by: Ataallah, Kirolos, et al.
Published: (2024)
Personalization as a Game: Equilibrium-Guided Generative Modeling for Physician Behavior in Pharmaceutical Engagement
by: Mishra, Suyash
Published: (2026)
Are Pose Estimators Ready for the Open World? STAGE: A GenAI Toolkit for Auditing 3D Human Pose Estimators
by: Kister, Nikita, et al.
Published: (2024)
Temporal Contrastive Learning for Video Temporal Reasoning in Large Vision-Language Models
by: Souza, Rafael, et al.
Published: (2024)