| Main Authors: | Wang, Yao, Liu, Xin, Liu, Zhuochen, Chen, Jiankang, Jatowt, Adam, Kim, Kyoungsook, Kando, Noriko, Yu, Haitao |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.17838 |
Similar Items
TEMPO: A Realistic Multi-Domain Benchmark for Temporal Reasoning-Intensive Retrieval
by: Abdallah, Abdelrahman, et al.
Published: (2026)
PARSE: An Open-Domain Reasoning Question Answering Benchmark for Persian
by: Mozafari, Jamshid, et al.
Published: (2026)
Wisdom of the Crowds in Forecasting: Forecast Summarization for Supporting Future Event Prediction
by: Saha, Anisha, et al.
Published: (2025)
Navigating Tomorrow: Reliably Assessing Large Language Models Performance on Future Event Prediction
by: Nako, Petraq, et al.
Published: (2025)
Generator-Retriever-Generator Approach for Open-Domain Question Answering
by: Abdallah, Abdelrahman, et al.
Published: (2023)
Exploring NLP Benchmarks in an Extremely Low-Resource Setting
by: Nuha, Ulin, et al.
Published: (2025)
A2Seek: Towards Reasoning-Centric Benchmark for Aerial Anomaly Understanding
by: Mo, Mengjingcheng, et al.
Published: (2025)
Assessing the Effectiveness of LLMs in Delivering Cognitive Behavioral Therapy
by: Bedi, Navdeep Singh, et al.
Published: (2026)
Towards Effective Time-Aware Language Representation: Exploring Enhanced Temporal Understanding in Language Models
by: Wang, Jiexin, et al.
Published: (2024)
Are LLMs effective psychological assessors? Leveraging adaptive RAG for interpretable mental health screening through psychometric practice
by: Ravenda, Federico, et al.
Published: (2025)
WikiHint: A Human-Annotated Dataset for Hint Ranking and Generation
by: Mozafari, Jamshid, et al.
Published: (2024)
Multi-hop Question Answering
by: Mavi, Vaibhav, et al.
Published: (2022)
RECOR: Reasoning-focused Multi-turn Conversational Retrieval Benchmark
by: Ali, Mohammed, et al.
Published: (2026)
Adventures of a Shut-in Librarian
by: Odescalchi, Esther Kando
Published: (1974)
Exploring Hint Generation Approaches in Open-Domain Question Answering
by: Mozafari, Jamshid, et al.
Published: (2024)
REGREACT: Self-Correcting Multi-Agent Pipelines for Structured Regulatory Information Extraction
by: Ali, Mohammed, et al.
Published: (2026)
Evaluating List Construction and Temporal Understanding capabilities of Large Language Models
by: Dumitru, Alexandru, et al.
Published: (2025)
Analyzing the Role of Context in Forecasting with Large Language Models
by: Mutschlechner, Gerrit, et al.
Published: (2025)
Automated Analysis of Sustainability Reports: Using Large Language Models for the Extraction and Prediction of EU Taxonomy-Compliant KPIs
by: Schmoll, Jonathan, et al.
Published: (2025)
Guess the Age of Photos: An Interactive Web Platform for Historical Image Age Estimation
by: Yucedag, Hasan, et al.
Published: (2025)
Temporal Validity Change Prediction
by: Wenzel, Georg, et al.
Published: (2024)
Transformers and Language Models in Form Understanding: A Comprehensive Review of Scanned Document Analysis
by: Abdallah, Abdelrahman, et al.
Published: (2024)
FactGuard: Event-Centric and Commonsense-Guided Fake News Detection
by: He, Jing, et al.
Published: (2025)
SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding
by: Ma, Jiefeng, et al.
Published: (2024)
HumanVideo-MME: Benchmarking MLLMs for Human-Centric Video Understanding
by: Cai, Yuxuan, et al.
Published: (2025)
HERM: Benchmarking and Enhancing Multimodal LLMs for Human-Centric Understanding
by: Li, Keliang, et al.
Published: (2024)
Text-to-Events: Synthetic Event Camera Streams from Conditional Text Input
by: Ott, Joachim, et al.
Published: (2024)
Characterizing Personality from Eye-Tracking: The Role of Gaze and Its Absence in Interactive Search Environments
by: He, Jiaman, et al.
Published: (2026)
LLMTemporalComparator: A Tool for Analysing Differences in Temporal Adaptations of Large Language Models
by: Fritsch, Reinhard Friedrich, et al.
Published: (2024)
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
by: Chen, Liyang, et al.
Published: (2025)
STMGF: An Effective Spatial-Temporal Multi-Granularity Framework for Traffic Forecasting
by: Zhao, Zhengyang, et al.
Published: (2024)
RANDPOL: Parameter-Efficient End-to-End Quadruped Locomotion via Randomized Policy Learning
by: Liu, Zhuochen, et al.
Published: (2025)
Model-Free Neural Filtering: A Comparison with Classical Filters in Nonlinear Systems
by: Liu, Zhuochen, et al.
Published: (2026)
HumanVBench: Probing Human-Centric Video Understanding in MLLMs with Automatically Synthesized Benchmarks
by: Zhou, Ting, et al.
Published: (2024)
Relational Object-Centric Actor-Critic
by: Ugadiarov, Leonid, et al.
Published: (2023)
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
by: Tang, Jingqun, et al.
Published: (2024)
LLM-Centric RAG with Multi-Granular Indexing and Confidence Constraints
by: Guo, Xiaofan, et al.
Published: (2025)
Text-space Graph Foundation Models: Comprehensive Benchmarks and New Insights
by: Chen, Zhikai, et al.
Published: (2024)
IMPACT: A Dataset for Multi-Granularity Human Procedural Action Understanding in Industrial Assembly
by: Wen, Di, et al.
Published: (2026)
Pretraining Exposure Explains Popularity Judgments in Large Language Models
by: Mozafari, Jamshid, et al.
Published: (2026)