| Main Authors: | Sun, Jing Han, Emami, Ali |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.13372 |
Similar Items
WSC+: Enhancing The Winograd Schema Challenge Using Tree-of-Experts
by: Zahraei, Pardis Sadat, et al.
Published: (2024)
Picturing Ambiguity: A Visual Twist on the Winograd Schema Challenge
by: Park, Brendan, et al.
Published: (2024)
EvoGrad: Metaheuristics in a Differentiable Wonderland
by: Citterio, Beatrice F. R., et al.
Published: (2025)
Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning
by: Artkaew, Phakphum
Published: (2024)
Concept-Reversed Winograd Schema Challenge: Evaluating and Improving Robust Reasoning in Large Language Models via Abstraction
by: Han, Kaiqiao, et al.
Published: (2024)
Solving the Challenge Set without Solving the Task: On Winograd Schemas as a Test of Pronominal Coreference Resolution
by: Porada, Ian, et al.
Published: (2024)
EvoSchema: Towards Text-to-SQL Robustness Against Schema Evolution
by: Zhang, Tianshu, et al.
Published: (2026)
Memory Dial: A Training Framework for Controllable Memorization in Language Models
by: Zhang, Xiangbo, et al.
Published: (2026)
Translate With Care: Addressing Gender Bias, Neutrality, and Reasoning in Large Language Model Translations
by: Zahraei, Pardis Sadat, et al.
Published: (2025)
Trace-of-Thought Prompting: Investigating Prompt-Based Knowledge Distillation Through Question Decomposition
by: McDonald, Tyler, et al.
Published: (2025)
Let LLMs Take on the Latest Challenges! A Chinese Dynamic Question Answering Benchmark
by: Xu, Zhikun, et al.
Published: (2024)
MirrorStories: Reflecting Diversity through Personalized Narrative Generation with Large Language Models
by: Yunusov, Sarfaroz, et al.
Published: (2024)
The Dog the Cat Chased Stumped the Model: Measuring When Language Models Abandon Structure for Shortcuts
by: Madhusudan, Sangmitra, et al.
Published: (2025)
Subtle Biases Need Subtler Measures: Dual Metrics for Evaluating Representative and Affinity Bias in Large Language Models
by: Kumar, Abhishek, et al.
Published: (2024)
TALE: A Tool-Augmented Framework for Reference-Free Evaluation of Large Language Models
by: Badshah, Sher, et al.
Published: (2025)
DART: Mitigating Harm Drift in Difference-Aware LLMs via Distill-Audit-Repair Training
by: Pan, Ziwen, et al.
Published: (2026)
LLMs4SchemaDiscovery: A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models
by: Sadruddin, Sameer, et al.
Published: (2025)
GradShield: Alignment Preserving Finetuning
by: Hu, Zhanhao, et al.
Published: (2026)
Can We Afford The Perfect Prompt? Balancing Cost and Accuracy with the Economical Prompting Index
by: McDonald, Tyler, et al.
Published: (2024)
Which Words Matter Most in Zero-Shot Prompts?
by: Sadr, Nikta Gohari, et al.
Published: (2025)
AutoSchemaKG: Autonomous Knowledge Graph Construction through Dynamic Schema Induction from Web-Scale Corpora
by: Bai, Jiaxin, et al.
Published: (2025)
NYT-Connections: A Deceptively Simple Text Classification Task that Stumps System-1 Thinkers
by: Lopez, Angel Yahir Loredo, et al.
Published: (2024)
EvoGens: A Population-Based Heuristic Search Framework for Scientific Idea Generation
by: Li, Xu, et al.
Published: (2026)
Personality Matters: User Traits Predict LLM Preferences in Multi-Turn Collaborative Tasks
by: Yunusov, Sarfaroz, et al.
Published: (2025)
Zero-Shot Open-Schema Entity Structure Discovery
by: Xu, Xueqiang, et al.
Published: (2025)
Encoding Hierarchical Schema via Concept Flow for Multifaceted Ideology Detection
by: Liu, Songtao, et al.
Published: (2024)
Schema-Aware Multi-Task Learning for Complex Text-to-SQL
by: Wu, Yangjun, et al.
Published: (2024)
STOP! Benchmarking Large Language Models with Sensitivity Testing on Offensive Progressions
by: Morabito, Robert, et al.
Published: (2024)
Schema for In-Context Learning
by: Chen, Pan, et al.
Published: (2025)
EvoLM: In Search of Lost Language Model Training Dynamics
by: Qi, Zhenting, et al.
Published: (2025)
DBCopilot: Natural Language Querying over Massive Databases via Schema Routing
by: Wang, Tianshu, et al.
Published: (2023)
Don't Adapt Small Language Models for Tools; Adapt Tool Schemas to the Models
by: Lee, Jonggeun, et al.
Published: (2025)
EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models
by: Jing, Linglin, et al.
Published: (2025)
TextualVerifier: Verify TextGrad Step-by-Step
by: Situmorang, Eugenius Mario, et al.
Published: (2025)
EvoWiki: Evaluating LLMs on Evolving Knowledge
by: Tang, Wei, et al.
Published: (2024)
AutoLink: Autonomous Schema Exploration and Expansion for Scalable Schema Linking in Text-to-SQL at Scale
by: Wang, Ziyang, et al.
Published: (2025)
Fine-Tuned LLMs are "Time Capsules" for Tracking Societal Bias Through Books
by: Madhusudan, Sangmitra, et al.
Published: (2025)
metaTextGrad: Automatically optimizing language model optimizers
by: Xu, Guowei, et al.
Published: (2025)
KCoEvo: A Knowledge Graph Augmented Framework for Evolutionary Code Generation
by: Kang, Jiazhen, et al.
Published: (2026)
Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance
by: Wang, Xixi, et al.
Published: (2025)