:: Library Catalog

Bibliographic Details
Main authors:	Naim, Omar; Bhar, Swarnadeep; Bolte, Jérôme; Asher, Nicholas
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online access:	https://arxiv.org/abs/2508.14685

Similar Items

Strong hallucinations from negation and how to fix them
by: Asher, Nicholas, et al.
Published: (2024)

COCORELI: Enforcing Execution Preconditions for Reliable Collaborative Instruction Following
by: Bhar, Swarnadeep, et al.
Published: (2025)

Analyzing limits for in-context learning
by: Naim, Omar, et al.
Published: (2025)

On Explaining with Attention Matrices
by: Naim, Omar, et al.
Published: (2024)

Re-examining learning linear functions in context
by: Naim, Omar, et al.
Published: (2024)

TELL-TALE: Task Efficient LLMs with Task Aware Layer Elimination
by: Naim, Omar, et al.
Published: (2025)

Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense
by: Tao, Leitian, et al.
Published: (2025)

ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs
by: Chen, Justin Chih-Yao, et al.
Published: (2023)

MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models
by: Chen, Justin Chih-Yao, et al.
Published: (2024)

Learning Semantic Structure through First-Order-Logic Translation
by: Chaturvedi, Akshay, et al.
Published: (2024)

Nebula: A discourse aware Minecraft Builder
by: Chaturvedi, Akshay, et al.
Published: (2024)

Branch-Solve-Merge Improves Large Language Model Evaluation and Generation
by: Saha, Swarnadeep, et al.
Published: (2023)

DIMSUM: Discourse in Mathematical Reasoning as a Supervision Module
by: Sharma, Krish, et al.
Published: (2025)

Llamipa: An Incremental Discourse Parser
by: Thompson, Kate, et al.
Published: (2024)

Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge
by: Saha, Swarnadeep, et al.
Published: (2025)

Verifying Peephole Rewriting In SSA Compiler IRs
by: Bhat, Siddharth, et al.
Published: (2024)

Validity Arguments For Constructed Response Scoring Using Generative Artificial Intelligence Applications
by: Casabianca, Jodi M., et al.
Published: (2025)

From Feature-Based Models to Generative AI: Validity Evidence for Constructed Response Scoring
by: Casabianca, Jodi M., et al.
Published: (2026)

The Majority is not always right: RL training for solution aggregation
by: Zhao, Wenting, et al.
Published: (2025)

The Denotational Semantics of SSA
by: Ghalayini, Jad Elkhaleq, et al.
Published: (2024)

GistScore: Learning Better Representations for In-Context Example Selection with Gist Bottlenecks
by: Gupta, Shivanshu, et al.
Published: (2023)

Improving Large Models with Small models: Lower Costs and Better Performance
by: Chen, Dong, et al.
Published: (2024)

SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space
by: Shen, Zhenyi, et al.
Published: (2025)

MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning
by: Chen, Justin Chih-Yao, et al.
Published: (2024)

TaCo: Targeted Concept Erasure Prevents Non-Linear Classifiers From Detecting Protected Attributes
by: Jourdan, Fanny, et al.
Published: (2023)

Minimal Clips, Maximum Salience: Long Video Summarization via Key Moment Extraction
by: Pennec, Galann, et al.
Published: (2025)

Integrating Video and Text: A Balanced Approach to Multimodal Summary Generation and Evaluation
by: Pennec, Galann, et al.
Published: (2025)

Modality-Agnostic fMRI Decoding of Vision and Language
by: Nikolaus, Mitja, et al.
Published: (2024)

From Scores to Steps: Diagnosing and Improving LLM Performance in Evidence-Based Medical Calculations
by: Wang, Benlu, et al.
Published: (2025)

InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks
by: Banerjee, Somnath, et al.
Published: (2024)

How Does Beam Search improve Span-Level Confidence Estimation in Generative Sequence Labeling?
by: Hashimoto, Kazuma, et al.
Published: (2022)

Exploring Plan Space through Conversation: An Agentic Framework for LLM-Mediated Explanations in Planning
by: Fouilhé, Guilhem, et al.
Published: (2026)

CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation
by: Basioti, Kalliopi, et al.
Published: (2024)

Score Before You Speak: Improving Persona Consistency in Dialogue Generation using Response Quality Scores
by: Saggar, Arpita, et al.
Published: (2025)

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning
by: Whitehouse, Chenxi, et al.
Published: (2025)

OptimalThinkingBench: Evaluating Over and Underthinking in LLMs
by: Aggarwal, Pranjal, et al.
Published: (2025)

KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning?
by: Saha, Soumadeep, et al.
Published: (2025)

Making Large Language Models Perform Better in Knowledge Graph Completion
by: Zhang, Yichi, et al.
Published: (2023)

RECSIP: REpeated Clustering of Scores Improving the Precision
by: Schamschurko, André, et al.
Published: (2025)

Evaluating Cumulative Spectral Gradient as a Complexity Measure
by: Gul, Haji, et al.
Published: (2025)