Library Catalog


Bibliographic Details
Main Authors:	Wang, Zhen; Bai, Fan; Luo, Zhongyan; Su, Jinyan; Sun, Kaiser; Yu, Xinle; Liu, Jieyuan; Zhou, Kun; Cardie, Claire; Dredze, Mark; Xing, Eric P.; Hu, Zhiting
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.02905

Similar Items

Knowing but Not Showing: LLMs Recognize Ambiguity but Rarely Ask Clarifying Questions
by: Su, Jinyan, et al.
Published: (2026)

Thinking Fast and Right: Balancing Accuracy and Reasoning Length with Adaptive Rewards
by: Su, Jinyan, et al.
Published: (2025)

Multi-Hop Question Answering: When Can Humans Help, and Where do They Struggle?
by: Su, Jinyan, et al.
Published: (2025)

Adapting Fake News Detection to the Era of Large Language Models
by: Su, Jinyan, et al.
Published: (2023)

Corpus Poisoning via Approximate Greedy Gradient Descent
by: Su, Jinyan, et al.
Published: (2024)

Task Matters: Knowledge Requirements Shape LLM Responses to Context-Memory Conflict
by: Sun, Kaiser, et al.
Published: (2025)

Between Underthinking and Overthinking: An Empirical Study of Reasoning Length and correctness in LLMs
by: Su, Jinyan, et al.
Published: (2025)

Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control
by: Su, Jinyan, et al.
Published: (2025)

CLIPer: Tailoring Diverse User Preference via Classifier-Guided Inference-Time Personalization
by: Su, Jinyan, et al.
Published: (2026)

Towards More Robust Retrieval-Augmented Generation: Evaluating RAG Under Adversarial Poisoning Attacks
by: Su, Jinyan, et al.
Published: (2024)

Amuro and Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models
by: Sun, Kaiser, et al.
Published: (2024)

Reasoning Court: Combining Reasoning, Action, and Judgment for Multi-Hop Reasoning
by: Wu, Jingtian, et al.
Published: (2025)

LLMs are Better Than You Think: Label-Guided In-Context Learning for Named Entity Recognition
by: Bai, Fan, et al.
Published: (2025)

Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs
by: Sun, Kaiser, et al.
Published: (2026)

Evaluating the Evaluators: Are readability metrics good measures of readability?
by: Cachola, Isabel, et al.
Published: (2025)

Decentralized Arena: Towards Democratic and Scalable Automatic Evaluation of Language Models
by: Yin, Yanbin, et al.
Published: (2025)

On the Failure of Latent State Persistence in Large Language Models
by: Huang, Jen-tse, et al.
Published: (2025)

Evaluating Implicit Biases in LLM Reasoning through Logic Grid Puzzles
by: Jahara, Fatima, et al.
Published: (2025)

scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery
by: Gao, Yiming, et al.
Published: (2026)

HAPO: Training Language Models to Reason Concisely via History-Aware Policy Optimization
by: Huang, Chengyu, et al.
Published: (2025)

Transferring Fairness using Multi-Task Learning with Limited Demographic Information
by: Aguirre, Carlos, et al.
Published: (2023)

MASH: Modeling Abstention via Selective Help-Seeking
by: Gul, Mustafa Omer, et al.
Published: (2025)

Give me Some Hard Questions: Synthetic Data Generation for Clinical QA
by: Bai, Fan, et al.
Published: (2024)

Schema-Driven Information Extraction from Heterogeneous Tables
by: Bai, Fan, et al.
Published: (2023)

Can one size fit all?: Measuring Failure in Multi-Document Summarization Domain Transfer
by: DeLucia, Alexandra, et al.
Published: (2025)

RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models
by: An, Bang, et al.
Published: (2025)

Rediscovery
by: Banchio, Martino, et al.
Published: (2025)

FairMT-Bench: Benchmarking Fairness for Multi-turn Dialogue in Conversational LLMs
by: Fan, Zhiting, et al.
Published: (2024)

Driving Strategy Using an Improved Ant Colony System for Energy‐Efficient Train
by: Chengda Yang, et al.
Published: (2024)

CellMaster: Collaborative Cell Type Annotation in Single-Cell Analysis
by: Wang, Zhen, et al.
Published: (2026)

How Far Are We From True Auto-Research?
by: Zhang, Zhengxin, et al.
Published: (2026)

CocoaBench: Evaluating Unified Digital Agents in the Wild
by: CocoaBench Team, et al.
Published: (2026)

Can Optimization Trajectories Explain Multi-Task Transfer?
by: Mueller, David, et al.
Published: (2024)

Token-weighted Direct Preference Optimization with Attention
by: Huang, Chengyu, et al.
Published: (2026)

I Could've Asked That: Reformulating Unanswerable Questions
by: Zhao, Wenting, et al.
Published: (2024)

Better LLM Reasoning via Dual-Play
by: Zhang, Zhengxin, et al.
Published: (2025)

Bootstrapping Post-training Signals for Open-ended Tasks via Rubric-based Self-play on Pre-training Text
by: Huang, Chengyu, et al.
Published: (2026)

Critiques of World Models
by: Xing, Eric, et al.
Published: (2025)

General Agentic Planning Through Simulative Reasoning with World Models
by: Deng, Mingkai, et al.
Published: (2025)

AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery
by: Xiong, Lei, et al.
Published: (2026)