| Main authors: | Naim, Omar; Bhar, Swarnadeep; Bolte, Jérôme; Asher, Nicholas |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2508.14685 |
Similar items
Strong hallucinations from negation and how to fix them
by: Asher, Nicholas, et al.
Published: (2024)
COCORELI: Enforcing Execution Preconditions for Reliable Collaborative Instruction Following
by: Bhar, Swarnadeep, et al.
Published: (2025)
Analyzing limits for in-context learning
by: Naim, Omar, et al.
Published: (2025)
On Explaining with Attention Matrices
by: Naim, Omar, et al.
Published: (2024)
Re-examining learning linear functions in context
by: Naim, Omar, et al.
Published: (2024)
TELL-TALE: Task Efficient LLMs with Task Aware Layer Elimination
by: Naim, Omar, et al.
Published: (2025)
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense
by: Tao, Leitian, et al.
Published: (2025)
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs
by: Chen, Justin Chih-Yao, et al.
Published: (2023)
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models
by: Chen, Justin Chih-Yao, et al.
Published: (2024)
Learning Semantic Structure through First-Order-Logic Translation
by: Chaturvedi, Akshay, et al.
Published: (2024)
Nebula: A discourse aware Minecraft Builder
by: Chaturvedi, Akshay, et al.
Published: (2024)
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation
by: Saha, Swarnadeep, et al.
Published: (2023)
DIMSUM: Discourse in Mathematical Reasoning as a Supervision Module
by: Sharma, Krish, et al.
Published: (2025)
Llamipa: An Incremental Discourse Parser
by: Thompson, Kate, et al.
Published: (2024)
Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge
by: Saha, Swarnadeep, et al.
Published: (2025)
Verifying Peephole Rewriting In SSA Compiler IRs
by: Bhat, Siddharth, et al.
Published: (2024)
Validity Arguments For Constructed Response Scoring Using Generative Artificial Intelligence Applications
by: Casabianca, Jodi M., et al.
Published: (2025)
From Feature-Based Models to Generative AI: Validity Evidence for Constructed Response Scoring
by: Casabianca, Jodi M., et al.
Published: (2026)
The Majority is not always right: RL training for solution aggregation
by: Zhao, Wenting, et al.
Published: (2025)
The Denotational Semantics of SSA
by: Ghalayini, Jad Elkhaleq, et al.
Published: (2024)
GistScore: Learning Better Representations for In-Context Example Selection with Gist Bottlenecks
by: Gupta, Shivanshu, et al.
Published: (2023)
Improving Large Models with Small models: Lower Costs and Better Performance
by: Chen, Dong, et al.
Published: (2024)
SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space
by: Shen, Zhenyi, et al.
Published: (2025)
MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning
by: Chen, Justin Chih-Yao, et al.
Published: (2024)
TaCo: Targeted Concept Erasure Prevents Non-Linear Classifiers From Detecting Protected Attributes
by: Jourdan, Fanny, et al.
Published: (2023)
Minimal Clips, Maximum Salience: Long Video Summarization via Key Moment Extraction
by: Pennec, Galann, et al.
Published: (2025)
Integrating Video and Text: A Balanced Approach to Multimodal Summary Generation and Evaluation
by: Pennec, Galann, et al.
Published: (2025)
Modality-Agnostic fMRI Decoding of Vision and Language
by: Nikolaus, Mitja, et al.
Published: (2024)
From Scores to Steps: Diagnosing and Improving LLM Performance in Evidence-Based Medical Calculations
by: Wang, Benlu, et al.
Published: (2025)
InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks
by: Banerjee, Somnath, et al.
Published: (2024)
How Does Beam Search improve Span-Level Confidence Estimation in Generative Sequence Labeling?
by: Hashimoto, Kazuma, et al.
Published: (2022)
Exploring Plan Space through Conversation: An Agentic Framework for LLM-Mediated Explanations in Planning
by: Fouilhé, Guilhem, et al.
Published: (2026)
CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation
by: Basioti, Kalliopi, et al.
Published: (2024)
Score Before You Speak: Improving Persona Consistency in Dialogue Generation using Response Quality Scores
by: Saggar, Arpita, et al.
Published: (2025)
J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning
by: Whitehouse, Chenxi, et al.
Published: (2025)
OptimalThinkingBench: Evaluating Over and Underthinking in LLMs
by: Aggarwal, Pranjal, et al.
Published: (2025)
KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning?
by: Saha, Soumadeep, et al.
Published: (2025)
Making Large Language Models Perform Better in Knowledge Graph Completion
by: Zhang, Yichi, et al.
Published: (2023)
RECSIP: REpeated Clustering of Scores Improving the Precision
by: Schamschurko, André, et al.
Published: (2025)
Evaluating Cumulative Spectral Gradient as a Complexity Measure
by: Gul, Haji, et al.
Published: (2025)