| Main Authors: | Begin, James, Agrawal, Namit, Singh, Eshan, Fu, Yicheng, O'Brien, Sean, Sharma, Vasu, Zhu, Kevin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.20405 |
Similar Items
NovelHopQA: Diagnosing Multi-Hop Reasoning Failures in Long Narrative Contexts
by: Gupta, Abhay, et al.
Published: (2025)
Sarc7: Evaluating Sarcasm Detection and Generation with Seven Types and Emotion-Informed Techniques
by: Xiong, Lang, et al.
Published: (2025)
ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models
by: Khalid, Haziq Mohammad, et al.
Published: (2025)
WOLF: Werewolf-based Observations for LLM Deception and Falsehoods
by: Agarwal, Mrinal, et al.
Published: (2025)
Reasoning Relay: Evaluating Stability and Interchangeability of Large Language Models in Mathematical Reasoning
by: Lu, Leo, et al.
Published: (2025)
SMAGDi: Socratic Multi Agent Interaction Graph Distillation for Efficient High Accuracy Reasoning
by: Aluru, Aayush, et al.
Published: (2025)
Pruning for Performance: Efficient Idiom and Metaphor Classification in Low-Resource Konkani Using mBERT
by: Do, Timothy, et al.
Published: (2025)
Question-Analysis Prompting Improves LLM Performance in Reasoning Tasks
by: Yugeswardeenoo, Dharunish, et al.
Published: (2024)
FAIRE: Assessing Racial and Gender Bias in AI-Driven Resume Evaluations
by: Wen, Athena, et al.
Published: (2025)
Causal Language Control in Multilingual Transformers via Sparse Feature Steering
by: Chou, Cheng-Ting, et al.
Published: (2025)
Universal Neurons in GPT-2: Emergence, Persistence, and Functional Impact
by: Nandan, Advey, et al.
Published: (2025)
Adaptive Originality Filtering: Rejection Based Prompting and RiddleScore for Culturally Grounded Multilingual Riddle Generation
by: Le, Duy, et al.
Published: (2025)
From Directions to Cones: Exploring Multidimensional Representations of Propositional Facts in LLMs
by: Yu, Stanley, et al.
Published: (2025)
Distill CLIP (DCLIP): Enhancing Image-Text Retrieval via Cross-Modal Transformer Distillation
by: Csizmadia, Daniel, et al.
Published: (2025)
TRUTH DECAY: Quantifying Multi-Turn Sycophancy in Language Models
by: Liu, Joshua, et al.
Published: (2025)
From Competition to Coordination: Market Making as a Scalable Framework for Safe and Aligned Multi-Agent LLM Systems
by: Gho, Brendan, et al.
Published: (2025)
The Geometry of Harmfulness in LLMs through Subconcept Probing
by: Shah, McNair, et al.
Published: (2025)
Deconstructing Bias: A Multifaceted Framework for Diagnosing Cultural and Compositional Inequities in Text-to-Image Generative Models
by: Said, Muna Numan, et al.
Published: (2025)
Direct Confidence Alignment: Aligning Verbalized Confidence with Internal Confidence In Large Language Models
by: Zhang, Glenn, et al.
Published: (2025)
COREVQA: A Crowd Observation and Reasoning Entailment Visual Question Answering Benchmark
by: Chintapatla, Ishant, et al.
Published: (2025)
Rosetta-PL: Propositional Logic as a Benchmark for Large Language Model Reasoning
by: Baek, Shaun, et al.
Published: (2025)
Advancing Uto-Aztecan Language Technologies: A Case Study on the Endangered Comanche Language
by: C, Jesus Alvarez, et al.
Published: (2025)
Interpreting the Latent Structure of Operator Precedence in Language Models
by: Yugeswardeenoo, Dharunish, et al.
Published: (2025)
AAVENUE: Detecting LLM Biases on NLU Tasks in AAVE via a Novel Benchmark
by: Gupta, Abhay, et al.
Published: (2024)
Implementing Open Approaches in the School.
by: O'Brien, E.
Published: (1977)
ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems
by: Singh, Ishneet Sukhvinder, et al.
Published: (2024)
Traversing European Coastlines (TREC) particle count and meterological data from land (2023-2024)
by: O'Brien, James
Published: (2026)
Error Reflection Prompting: Can Large Language Models Successfully Understand Errors?
by: Li, Jason, et al.
Published: (2025)
CLEAR: Contrasting Textual Feedback with Experts and Amateurs for Reasoning
by: Rufail, Andrew, et al.
Published: (2025)
Improving LLM Abilities in Idiomatic Translation
by: Donthi, Sundesh, et al.
Published: (2024)
Sistemas de información gerencial / James A. O’Brien, George M. Marakas ; traducción, María Jesús Herrero Díaz, Miguel Ángel Sánchez Carrión
by: O’Brien, James A
Federated Stream-Processing and Latency-Gated Response for Cross-Sector Threat Detection and Collaborative Containment
by: Mohale, Namit
Published: (2026)
In-In EFT
by: Mahajan, Namit
Published: (2025)
MALIBU Benchmark: Multi-Agent LLM Implicit Bias Uncovered
by: Mirza, Imran, et al.
Published: (2025)
Ultrafast Superconducting Qubit Readout with the Quarton Coupler
by: Ye, Yufeng, et al.
Published: (2024)
From Bias to Balance: Detecting Facial Expression Recognition Biases in Large Multimodal Foundation Models
by: Chhua, Kaylee, et al.
Published: (2024)
Revolutionizing Early Detection: A Comprehensive Cognitive Screening Tool for ADRD Through Innovative Mobile App Technology
by: Lenora W Smith, et al.
Published: (2024)
The Ethics of AI generated artworks
by: Eshan Divecha
Published: (2022)
Using a Capability Approach to Explore How People With Intellectual Disabilities Can Lead Flourishing Lives
by: Sara Ryan, et al.
Published: (2024)
A Few Bad Neurons: Isolating and Surgically Correcting Sycophancy
by: O'Brien, Claire, et al.
Published: (2026)