| Main Authors: | Galanti, Liane; Baron, Ethan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2407.18968 |
Similar Items
From Grounding to Planning: Benchmarking Bottlenecks in Web Agents
by: Shlomov, Segev, et al.
Published: (2024)
AgentFixer: From Failure Detection to Fix Recommendations in LLM Agentic Systems
by: Mulian, Hadar, et al.
Published: (2026)
The Fair Language Model Paradox
by: Pinto, Andrea, et al.
Published: (2024)
Building Artificial Intelligence with Creative Agency and Self-hood
by: Gabora, Liane, et al.
Published: (2024)
Tool Building as a Path to "Superintelligence"
by: Koplow, David, et al.
Published: (2026)
Agentic Systems as Boosting Weak Reasoning Models
by: Sunkaraneni, Varun, et al.
Published: (2026)
Evaluating the Meta- and Object-Level Reasoning of Large Language Models for Question Answering
by: Ferguson, Nick, et al.
Published: (2025)
Directional Neural Collapse Explains Few-Shot Transfer in Self-Supervised Learning
by: Luthra, Achleshwar, et al.
Published: (2026)
Distribution-Aware Algorithm Design with LLM Agents
by: Koganti, Saharsh, et al.
Published: (2026)
DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization
by: Li, Gang, et al.
Published: (2025)
Generalization or Memorization? Brittleness Testing for Chess-Trained Language Models
by: Tang, Ethan
Published: (2026)
Does Self-Evaluation Enable Wireheading in Language Models?
by: Africa, David Demitri, et al.
Published: (2025)
The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity?
by: Hägele, Alexander, et al.
Published: (2026)
Are Audio-Language Models Listening? Audio-Specialist Heads for Adaptive Audio Steering
by: Glazer, Neta, et al.
Published: (2026)
Language Models can Self-Improve at State-Value Estimation for Better Search
by: Mendes, Ethan, et al.
Published: (2025)
Anticipatory Evaluation of Language Models
by: Park, Jungsoo, et al.
Published: (2025)
Knowledge Tagging with Large Language Model based Multi-Agent System
by: Li, Hang, et al.
Published: (2024)
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
by: Rocamonde, Juan, et al.
Published: (2023)
A Unified Assessment of the Poverty of the Stimulus Argument for Neural Language Models
by: Yang, Xiulin, et al.
Published: (2026)
Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference
by: Timor, Nadav, et al.
Published: (2024)
Uncovering Latent Bias in LLM-Based Emergency Department Triage Through Proxy Variables
by: Zhang, Ethan
Published: (2026)
Jailbreaking Large Vision Language Models in Intelligent Transportation Systems
by: Das, Badhan Chandra, et al.
Published: (2025)
Multi-Agent Collaborative Intelligence: Dual-Dial Control for Reliable LLM Reasoning
by: Chang, Edward Y., et al.
Published: (2025)
Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate
by: Chern, Steffi, et al.
Published: (2024)
How to Measure the Intelligence of Large Language Models?
by: Körber, Nils, et al.
Published: (2024)
The Unified Cognitive Consciousness Theory for Language Models: Anchoring Semantics, Thresholds of Activation, and Emergent Reasoning
by: Chang, Edward Y., et al.
Published: (2025)
Structured Cognitive Loop for Behavioral Intelligence in Large Language Model Agents
by: Kim, Myung Ho
Published: (2025)
TMIQ: Quantifying Test and Measurement Domain Intelligence in Large Language Models
by: Olowe, Emmanuel A., et al.
Published: (2025)
Can LLMs Detect Intrinsic Hallucinations in Paraphrasing and Machine Translation?
by: Gogoulou, Evangelia, et al.
Published: (2025)
Diagnostics of cognitive failures in multi-agent expert systems using dynamic evaluation protocols and subsequent mutation of the processing context
by: Sorstkins, Andrejs, et al.
Published: (2025)
Evaluating and Enhancing Large Language Models for Novelty Assessment in Scholarly Publications
by: Lin, Ethan, et al.
Published: (2024)
Function Words as Statistical Cues for Language Learning
by: Yang, Xiulin, et al.
Published: (2026)
Asymptotic and Finite Sample Analysis of Nonexpansive Stochastic Approximations with Markovian Noise
by: Blaser, Ethan, et al.
Published: (2024)
Artificial Intelligence as Strange Intelligence: Against Linear Models of Intelligence
by: Chilson, Kendra, et al.
Published: (2026)
Eureka-Audio: Triggering Audio Intelligence in Compact Language Models
by: Zhang, Dan, et al.
Published: (2026)
Language Models Learn Constructional Semantics, Not To Mention Syntax: Investigating LM Understanding of Paired-Focus Constructions
by: Scivetti, Wesley, et al.
Published: (2026)
Apple Intelligence Foundation Language Models
by: Gunter, Tom, et al.
Published: (2024)
Large Language Models and Scientific Discourse: Where's the Intelligence?
by: Collins, Harry, et al.
Published: (2026)
Recursive Symbolic Consciousness: A Formal Model of Emergent Intelligence Across Minds and Machines
by: Goudy, Anastasia
Published: (2025)
Organizing a Society of Language Models: Structures and Mechanisms for Enhanced Collective Intelligence
by: Ferreira, Silvan, et al.
Published: (2024)