:: Library Catalog

Bibliographic Details
Main Authors:	Ghosh, Archishman; Roy, Abhinaba; Herremans, Dorien
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; 68T01; I.2.6; I.2.10; H.3.3
Online Access:	https://arxiv.org/abs/2605.08175

Similar Items

CourseTimeQA: A Lecture-Video Benchmark and a Latency-Constrained Cross-Modal Fusion Method for Timestamped QA
by: Kovalev, Vsevolod, et al.
Published: (2025)

ORPHEAS: A Cross-Lingual Greek-English Embedding Model for Retrieval-Augmented Generation
by: Livieris, Ioannis E., et al.
Published: (2026)

APEX: Large-scale Multi-task Aesthetic-Informed Popularity Prediction for AI-Generated Music
by: Husain, Jaavid Aktar, et al.
Published: (2026)

EdgeJury: Cross-Reviewed Small-Model Ensembles for Truthful Question Answering on Serverless Edge Inference
by: Kumar, Aayush
Published: (2025)

Improving Graph Embeddings in Machine Learning Using Knowledge Completion with Validation in a Case Study on COVID-19 Spread
by: Napoli, Rosario, et al.
Published: (2025)

A Hybrid Multimodal Deep Learning Framework for Intelligent Fashion Recommendation
by: Kalashi, Kamand, et al.
Published: (2025)

LinkedOut: Linking World Knowledge Representation Out of Video LLM for Next-Generation Video Recommendation
by: Zhang, Haichao, et al.
Published: (2025)

Playing telephone with generative models: "verification disability," "compelled reliance," and accessibility in data visualization
by: Elavsky, Frank, et al.
Published: (2025)

TriAlignGR: Triangular Multitask Alignment with Multimodal Deep Interest Mining for Generative Recommendation
by: Zeng, Yangchen, et al.
Published: (2026)

EQUATOR: A Deterministic Framework for Evaluating LLM Reasoning with Open-Ended Questions. # v1.0.0-beta
by: Bernard, Raymond, et al.
Published: (2024)

AVATAAR: Agentic Video Answering via Temporal Adaptive Alignment and Reasoning
by: Patel, Urjitkumar, et al.
Published: (2025)

Quantifying and Narrowing the Unknown: Interactive Text-to-Video Retrieval via Uncertainty Minimization
by: Zhang, Bingqing, et al.
Published: (2025)

SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering
by: Melechovsky, Jan, et al.
Published: (2025)

Bottleneck-based Encoder-decoder ARchitecture (BEAR) for Learning Unbiased Consumer-to-Consumer Image Representations
by: Rivas, Pablo, et al.
Published: (2024)

Graph-PiT: Enhancing Structural Coherence in Part-Based Image Synthesis via Graph Priors
by: Zhang, Junbin, et al.
Published: (2026)

QoSGMAA: A Robust Multi-Order Graph Attention and Adversarial Framework for Sparse QoS Prediction
by: Du, Guanchen, et al.
Published: (2025)

Optimizing Retrieval-Augmented Generation (RAG) for Colloquial Cantonese: A LoRA-Based Systematic Review
by: Calonge, David Santandreu, et al.
Published: (2025)

ReCoVR: Closing the Loop in Interactive Composed Video Retrieval
by: Zhang, Bingqing, et al.
Published: (2026)

AIDOVECL: AI-generated Dataset of Outpainted Vehicles for Eye-level Classification and Localization
by: Kazemi, Amir, et al.
Published: (2024)

COBRA-PPM: A Causal Bayesian Reasoning Architecture Using Probabilistic Programming for Robot Manipulation Under Uncertainty
by: Cannizzaro, Ricardo, et al.
Published: (2024)

Fact Grounded Attention: Eliminating Hallucination in Large Language Models Through Attention Level Knowledge Integration
by: Gupta, Aayush
Published: (2025)

A Hybrid Framework for Real-Time Data Drift and Anomaly Identification Using Hierarchical Temporal Memory and Statistical Tests
by: Bandyopadhyay, Subhadip, et al.
Published: (2025)

Dense Video Understanding with Gated Residual Tokenization
by: Zhang, Haichao, et al.
Published: (2025)

Inducing Causal World Models in LLMs for Zero-Shot Physical Reasoning
by: Sharma, Aditya, et al.
Published: (2025)

Attention Gathers, MLPs Compose: A Causal Analysis of an Action-Outcome Circuit in VideoViT
by: Chereddy, Sai V R
Published: (2026)

TexTile: A Differentiable Metric for Texture Tileability
by: Rodriguez-Pardo, Carlos, et al.
Published: (2024)

Robust Test-time Video-Text Retrieval: Benchmarking and Adapting for Query Shifts
by: Zhang, Bingqing, et al.
Published: (2026)

Classifier Calibration at Scale: An Empirical Study of Model-Agnostic Post-Hoc Methods
by: Manokhin, Valery, et al.
Published: (2026)

Combating data scarcity in recommendation services: Integrating cognitive types of VARK and neural network technologies (LLM)
by: Zmanovskii, Nikita
Published: (2026)

Task Memory Engine (TME): Enhancing State Awareness for Multi-Step LLM Agent Tasks
by: Ye, Ye
Published: (2025)

ExpReS-VLA: Specializing Vision-Language-Action Models Through Experience Replay and Retrieval
by: Syed, Shahram Najam, et al.
Published: (2025)

Tokenization Standards for Linguistic Integrity: Turkish as a Benchmark
by: Bayram, M. Ali, et al.
Published: (2025)

Prevailing Research Areas for Music AI in the Era of Foundation Models
by: Wei, Megan, et al.
Published: (2024)

Classification of Cattle Behavior and Detection of Heat (Estrus) using Sensor Data
by: Dhakshinamoorthy, Druva, et al.
Published: (2025)

Exploring specialization and sensitivity of convolutional neural networks in the context of simultaneous image augmentations
by: Kharyuk, Pavel, et al.
Published: (2025)

Platonic Representations in the Human Brain: Unsupervised Recovery of Universal Geometry
by: Marcos-Manchón, Pablo, et al.
Published: (2026)

Mubeen AI: A Specialized Arabic Language Model for Heritage Preservation and User Intent Understanding
by: Aljafari, Mohammed, et al.
Published: (2025)

Query-Aware Flow Diffusion for Graph-Based RAG with Retrieval Guarantees
by: Zhou, Zhuoping, et al.
Published: (2026)

Diffusion Models Generate Images Like Painters: an Analytical Theory of Outline First, Details Later
by: Wang, Binxu, et al.
Published: (2023)

Application of Sensitivity Analysis Methods for Studying Neural Network Models
by: Miao, Jiaxuan, et al.
Published: (2025)