| Main Authors: | Wang, Yang, Xiao, Chenghao, Hsiao, Chia-Yi, Chang, Zi Yan, Chen, Chi-Li, Loakman, Tyler, Lin, Chenghua |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.03867 |
Similar Items
Seeing isn't Hearing: Benchmarking Vision Language Models at Interpreting Spectrograms
by: Loakman, Tyler, et al.
Published: (2025)
ReproHum #0087-01: Human Evaluation Reproduction Report for Generating Fact Checking Explanations
by: Loakman, Tyler, et al.
Published: (2024)
Train & Constrain: Phonologically Informed Tongue-Twister Generation from Topics and Paraphrases
by: Loakman, Tyler, et al.
Published: (2024)
With Ears to See and Eyes to Hear: Sound Symbolism Experiments with Multimodal Large Language Models
by: Loakman, Tyler, et al.
Published: (2024)
Who's Laughing Now? An Overview of Computational Humour Generation and Explanation
by: Loakman, Tyler, et al.
Published: (2025)
Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes
by: Loakman, Tyler, et al.
Published: (2025)
Exploring Task Performance with Interpretable Models via Sparse Auto-Encoders
by: Wang, Shun, et al.
Published: (2025)
CADGE: Context-Aware Dialogue Generation Enhanced with Graph-Structured Knowledge Aggregation
by: Zhang, Hongbo, et al.
Published: (2023)
MMTE: Corpus and Metrics for Evaluating Machine Translation Quality of Metaphorical Language
by: Wang, Shun, et al.
Published: (2024)
Effective Distillation of Table-based Reasoning Ability from LLMs
by: Yang, Bohao, et al.
Published: (2023)
On the Rigour of Scientific Writing: Criteria, Analysis, and Insights
by: James, Joseph, et al.
Published: (2024)
Crafting Customisable Characters with LLMs: A Persona-Driven Role-Playing Agent Framework
by: Yang, Bohao, et al.
Published: (2024)
LongEval: A Comprehensive Analysis of Long-Text Generation Through a Plan-based Paradigm
by: Wu, Siwei, et al.
Published: (2025)
Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts
by: Hong, Hanhua, et al.
Published: (2025)
RIGOURATE: Quantifying Scientific Exaggeration with Evidence-Aligned Claim Evaluation
by: James, Joseph, et al.
Published: (2026)
Adversarial Defence without Adversarial Defence: Enhancing Language Model Robustness via Instance-level Principal Component Removal
by: Wang, Yang, et al.
Published: (2025)
BioMNER: A Dataset for Biomedical Method Entity Recognition
by: Tang, Chen, et al.
Published: (2024)
Audio Contrastive-based Fine-tuning: Decoupling Representation Learning and Classification
by: Wang, Yang, et al.
Published: (2023)
From Facts to Insights: A Study on the Generation and Evaluation of Analytical Reports for Deciphering Earnings Calls
by: Goldsack, Tomas, et al.
Published: (2024)
Finding Challenging Metaphors that Confuse Pretrained Language Models
by: Li, Yucheng, et al.
Published: (2024)
Tougher Text, Smarter Models: Raising the Bar for Adversarial Defence Benchmarks
by: Wang, Yang, et al.
Published: (2025)
In Defense of "Ignorant Drivel".
by: Boyce, Bert R., et al.
Published: (1987)
Overview of the NLPCC 2025 Shared Task: Gender Bias Mitigation Challenge
by: Li, Yizhi, et al.
Published: (2025)
Equipping Transformer with Random-Access Reading for Long-Context Understanding
by: Yang, Chenghao, et al.
Published: (2024)
X-ray Made Simple: Lay Radiology Report Generation and Robust Evaluation
by: Zhao, Kun, et al.
Published: (2024)
LLMs as Narcissistic Evaluators: When Ego Inflates Evaluation Scores
by: Liu, Yiqi, et al.
Published: (2023)
Unlearning in LLMs: Methods, Evaluation, and Open Challenges
by: Lizzo, Tyler, et al.
Published: (2026)
The(y)ology
by: Brumberg-Kraus, Max Yeshaye
Published: (2023)
Emphasising Structured Information: Integrating Abstract Meaning Representation into LLMs for Enhanced Open-Domain Dialogue Evaluation
by: Yang, Bohao, et al.
Published: (2024)
CAST: Corpus-Aware Self-similarity Enhanced Topic modelling
by: Ma, Yanan, et al.
Published: (2024)
Benchmarking for Domain-Specific LLMs: A Case Study on Academia and Beyond
by: Chen, Rubing, et al.
Published: (2025)
Nonsense Helps: Prompt Space Perturbation Broadens Reasoning Exploration
by: Huang, Langlin, et al.
Published: (2026)
The Achilles' Heel of Angular Margins: A Chebyshev Polynomial Fix for Speaker Verification
by: Wang, Yang, et al.
Published: (2026)
Prompt-Induced Linguistic Fingerprints for LLM-Generated Fake News Detection
by: Wang, Chi, et al.
Published: (2025)
Self-Evolved Reward Learning for LLMs
by: Huang, Chenghua, et al.
Published: (2024)
Observing Micromotives and Macrobehavior of Large Language Models
by: Cheng, Yuyang, et al.
Published: (2024)
Natural Language Generation
by: van Miltenburg, Emiel, et al.
Published: (2025)
An Open Source Data Contamination Report for Large Language Models
by: Li, Yucheng, et al.
Published: (2023)
LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time-Sensitive Test Construction
by: Li, Yucheng, et al.
Published: (2023)
Quantifier Scope Interpretation in Language Learners and LLMs
by: Fang, Shaohua, et al.
Published: (2025)