| Main Authors: | Yang, Baoyao, Chen, Junxiang, Li, Wanyun, Yao, Wenbin, Zhou, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.02885 |
Similar Items
VideoMind: An Omni-Modal Video Dataset with Intent Grounding for Deep-Cognitive Video Understanding
by: Yang, Baoyao, et al.
Published: (2025)
Leveraging OpenFlamingo for Multimodal Embedding Analysis of C2C Car Parts Data
by: Rashid, Maisha Binte, et al.
Published: (2025)
Evaluating Perspectival Biases in Cross-Modal Retrieval
by: Saengsukhiran, Teerapol, et al.
Published: (2025)
A Grounded Memory System For Smart Personal Assistants
by: Ocker, Felix, et al.
Published: (2025)
TriAlignGR: Triangular Multitask Alignment with Multimodal Deep Interest Mining for Generative Recommendation
by: Zeng, Yangchen, et al.
Published: (2026)
Large Language Model for Qualitative Research -- A Systematic Mapping Study
by: Barros, Cauã Ferreira, et al.
Published: (2024)
Semantic Reconstruction of Adversarial Plagiarism: A Context-Aware Framework for Detecting and Restoring "Tortured Phrases" in Scientific Literature
by: Maiti, Agniva, et al.
Published: (2025)
DEUCE: Dual-diversity Enhancement and Uncertainty-awareness for Cold-start Active Learning
by: Guo, Jiaxin, et al.
Published: (2025)
An Ensemble Embedding Approach for Improving Semantic Caching Performance in LLM-based Systems
by: Ghaffari, Shervin, et al.
Published: (2025)
ARTAI: An Evaluation Platform to Assess Societal Risk of Recommender Algorithms
by: Ruan, Qin, et al.
Published: (2024)
AI vs. Human Moderators: A Comparative Evaluation of Multimodal LLMs in Content Moderation for Brand Safety
by: Levi, Adi, et al.
Published: (2025)
Tensor Manifold-Based Graph-Vector Fusion for AI-Native Academic Literature Retrieval
by: Wei, Xing, et al.
Published: (2026)
Visualizing the Evolution of Twitter (X.com) Conversations: A Comprehensive Methodology Applied to AI Training Discussions on ChatGPT
by: Jess, Nicole, et al.
Published: (2024)
Exploring Diagnostic Prompting Approach for Multimodal LLM-based Visual Complexity Assessment: A Case Study of Amazon Search Result Pages
by: Murtadak, Divendar, et al.
Published: (2025)
QoSGMAA: A Robust Multi-Order Graph Attention and Adversarial Framework for Sparse QoS Prediction
by: Du, Guanchen, et al.
Published: (2025)
Spatially-Grounded Document Retrieval via Patch-to-Region Relevance Propagation
by: Georgiou, Athos
Published: (2025)
Predicting When to Trust Vision-Language Models for Spatial Reasoning
by: Imran, Muhammad, et al.
Published: (2026)
ReCoVR: Closing the Loop in Interactive Composed Video Retrieval
by: Zhang, Bingqing, et al.
Published: (2026)
FinBERT-QA: Financial Question Answering with pre-trained BERT Language Models
by: Yuan, Bithiah
Published: (2025)
MoXaRt: Audio-Visual Object-Guided Sound Interaction for XR
by: Xu, Tianyu, et al.
Published: (2026)
LinkedOut: Linking World Knowledge Representation Out of Video LLM for Next-Generation Video Recommendation
by: Zhang, Haichao, et al.
Published: (2025)
AVATAAR: Agentic Video Answering via Temporal Adaptive Alignment and Reasoning
by: Patel, Urjitkumar, et al.
Published: (2025)
From Citation Selection to Citation Absorption: A Measurement Framework for Generative Engine Optimization Across AI Search Platforms
by: Kai, Zhang, et al.
Published: (2026)
Dense Video Understanding with Gated Residual Tokenization
by: Zhang, Haichao, et al.
Published: (2025)
DALL-M: Context-Aware Clinical Data Augmentation with LLMs
by: Hsieh, Chihcheng, et al.
Published: (2024)
Bottleneck-based Encoder-decoder ARchitecture (BEAR) for Learning Unbiased Consumer-to-Consumer Image Representations
by: Rivas, Pablo, et al.
Published: (2024)
Higher education assessment practice in the era of generative AI tools
by: Ogunleye, Bayode, et al.
Published: (2024)
Enhancing XR Auditory Realism via Multimodal Scene-Aware Acoustic Rendering
by: Xu, Tianyu, et al.
Published: (2025)
VOGUE: A Multimodal Dataset for Conversational Recommendation in Fashion
by: Guo, David, et al.
Published: (2025)
Correspondence of high-dimensional emotion structures elicited by video clips between humans and Multimodal LLMs
by: Asanuma, Haruka, et al.
Published: (2025)
Experimentation Accelerator: Interpretable Insights and Creative Recommendations for A/B Testing with Content-Aware ranking
by: Hu, Zhengmian, et al.
Published: (2026)
CourseTimeQA: A Lecture-Video Benchmark and a Latency-Constrained Cross-Modal Fusion Method for Timestamped QA
by: Kovalev, Vsevolod, et al.
Published: (2025)
Real-World En Call Center Transcripts Dataset with PII Redaction
by: Dao, Ha, et al.
Published: (2025)
ORPHEAS: A Cross-Lingual Greek-English Embedding Model for Retrieval-Augmented Generation
by: Livieris, Ioannis E., et al.
Published: (2026)
HySemRAG: A Hybrid Semantic Retrieval-Augmented Generation Framework for Automated Literature Synthesis and Methodological Gap Analysis
by: Godinez, Alejandro
Published: (2025)
Incorporating Legal Structure in Retrieval-Augmented Generation: A Case Study on Copyright Fair Use
by: Ho, Justin, et al.
Published: (2025)
To Retrieve or Not to Retrieve? Uncertainty Detection for Dynamic Retrieval Augmented Generation
by: Dhole, Kaustubh D.
Published: (2025)
LLMLogAnalyzer: A Clustering-Based Log Analysis Chatbot using Large Language Models
by: Cai, Peng, et al.
Published: (2025)
Quantifying and Narrowing the Unknown: Interactive Text-to-Video Retrieval via Uncertainty Minimization
by: Zhang, Bingqing, et al.
Published: (2025)
LiftAvatar: Kinematic-Space Completion for Expression-Controlled 3D Gaussian Avatar Animation
by: Wei, Hualiang, et al.
Published: (2026)