| Main Authors: | Xu, Weijie, Cui, Shixian, Fang, Xi, Xue, Chi, Eckman, Stephanie, Reddy, Chandan K. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.00643 |
Similar Items
Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense
by: Zhang, Zhehao, et al.
Published: (2025)
The Personalization Trap: How User Memory Alters Emotional Reasoning in LLMs
by: Fang, Xi, et al.
Published: (2025)
Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective
by: Xu, Weijie, et al.
Published: (2025)
A Comparative Study of Feature Selection in Tsetlin Machines
by: Halenka, Vojtech, et al.
Published: (2025)
Benchmarking Energy Efficiency of Large Language Models Using vLLM
by: Pronk, K., et al.
Published: (2025)
ContractEval: A Benchmark for Evaluating Contract-Satisfying Assertions in Code Generation
by: Lim, Soohan, et al.
Published: (2025)
An Epidemiological Knowledge Graph extracted from the World Health Organization's Disease Outbreak News
by: Consoli, Sergio, et al.
Published: (2025)
TSDS: Data Selection for Task-Specific Model Finetuning
by: Liu, Zifan, et al.
Published: (2024)
AgentLeak: A Full-Stack Benchmark for Privacy Leakage in Multi-Agent LLM Systems
by: Yagoubi, Faouzi El, et al.
Published: (2026)
Solving Zebra Puzzles Using Constraint-Guided Multi-Agent Systems
by: Berman, Shmuel, et al.
Published: (2024)
Low-Resource Neural Machine Translation Using Recurrent Neural Networks and Transfer Learning: A Case Study on English-to-Igbo
by: Ekle, Ocheme Anthony, et al.
Published: (2025)
From MTEB to MTOB: Retrieval-Augmented Classification for Descriptive Grammars
by: Kornilov, Albert, et al.
Published: (2024)
Project Synapse: A Hierarchical Multi-Agent Framework with Hybrid Memory for Autonomous Resolution of Last-Mile Delivery Disruptions
by: Yadav, Arin Gopalan, et al.
Published: (2026)
Reasoning aligns language models to human cognition
by: Guiomar, Gonçalo, et al.
Published: (2026)
Benchmarking Deception Probes via Black-to-White Performance Boosts
by: Parrack, Avi, et al.
Published: (2025)
Controlling Long-Horizon Behavior in Language Model Agents with Explicit State Dynamics
by: Subaharan, Sukesh
Published: (2026)
Vibe-Creation: The Epistemology of Human-AI Emergent Cognition
by: Levin, Ilya
Published: (2026)
Navigational Thinking as an Emerging Paradigm of Computer Science in the Age of Generative AI
by: Levin, Ilya
Published: (2026)
Large Language Models are Inconsistent and Biased Evaluators
by: Stureborg, Rickard, et al.
Published: (2024)
A transfer learning approach for automatic conflicts detection in software requirement sentence pairs based on dual encoders
by: Wang, Yizheng, et al.
Published: (2025)
An Automatic Text Classification Method Based on Hierarchical Taxonomies, Neural Networks and Document Embedding: The NETHIC Tool
by: Lomasto, Luigi, et al.
Published: (2026)
Mind the Metrics: Patterns for Telemetry-Aware In-IDE AI Application Development using the Model Context Protocol (MCP)
by: Koc, Vincent, et al.
Published: (2025)
Data and AI governance: Promoting equity, ethics, and fairness in large language models
by: Abhishek, Alok, et al.
Published: (2025)
SHARP: Social Harm Analysis via Risk Profiles for Measuring Inequities in Large Language Models
by: Abhishek, Alok, et al.
Published: (2026)
BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models
by: Abhishek, Alok, et al.
Published: (2025)
Reasoning Promotes Robustness in Theory of Mind Tasks
by: de Haan, Ian B., et al.
Published: (2026)
Epidemic Information Extraction for Event-Based Surveillance using Large Language Models
by: Consoli, Sergio, et al.
Published: (2024)
Tailoring Vaccine Messaging with Common-Ground Opinions
by: Stureborg, Rickard, et al.
Published: (2024)
CUBO: Self-Contained Retrieval-Augmented Generation on Consumer Laptops 10 GB Corpora, 16 GB RAM, Single-Device Deployment
by: Astrino, Paolo
Published: (2026)
HumanMCP: A Human-Like Query Dataset for Evaluating MCP Tool Retrieval Performance
by: Laddha, Shubh, et al.
Published: (2025)
RHealthTwin: Towards Responsible and Multimodal Digital Twins for Personalized Well-being
by: Ferdousi, Rahatara, et al.
Published: (2025)
Triplètoile: Extraction of Knowledge from Microblogging Text
by: Zavarella, Vanni, et al.
Published: (2024)
On Adversarial Examples for Text Classification by Perturbing Latent Representations
by: Sooksatra, Korn, et al.
Published: (2024)
Morphological Synthesizer for Ge'ez Language: Addressing Morphological Complexity and Resource Limitations
by: Gebremariam, Gebrearegawi, et al.
Published: (2025)
KNIGHT: Knowledge Graph-Driven Multiple-Choice Question Generation with Adaptive Hardness Calibration
by: Amanlou, Mohammad, et al.
Published: (2026)
Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey
by: Fang, Xi, et al.
Published: (2024)
AI Model for Predicting Binding Affinity of Antidiabetic Compounds Targeting PPAR
by: Aman, La Ode, et al.
Published: (2024)
MORQA: Benchmarking Evaluation Metrics for Medical Open-Ended Question Answering
by: Yim, Wen-wai, et al.
Published: (2025)
Murphys Laws of AI Alignment: Why the Gap Always Wins
by: Gaikwad, Madhava
Published: (2025)
PaperAudit-Bench: Benchmarking Error Detection in Research Papers for Critical Automated Peer Review
by: Tu, Songjun, et al.
Published: (2026)