:: Library Catalog

Bibliographic Details
Main Authors:	Goldsack, Tomas; Wang, Yang; Lin, Chenghua; Chen, Chung-Chi
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2410.01039

Similar Items

Leveraging Large Language Models for Zero-shot Lay Summarisation in Biomedicine and Beyond
by: Goldsack, Tomas, et al.
Published: (2025)

Observing Micromotives and Macrobehavior of Large Language Models
by: Cheng, Yuyang, et al.
Published: (2024)

ATLAS: Improving Lay Summarisation with Attribute-based Control
by: Zhang, Zhihao, et al.
Published: (2024)

Overview of the BioLaySumm 2024 Shared Task on the Lay Summarization of Biomedical Research Articles
by: Goldsack, Tomas, et al.
Published: (2024)

Co-Trained Retriever-Generator Framework for Question Generation in Earnings Calls
by: Juan, Yining, et al.
Published: (2024)

ReproHum #0087-01: Human Evaluation Reproduction Report for Generating Fact Checking Explanations
by: Loakman, Tyler, et al.
Published: (2024)

EvasionBench: A Large-Scale Benchmark for Detecting Managerial Evasion in Earnings Call Q&A
by: Ma, Shijian, et al.
Published: (2026)

Evaluating Large Language Models for Stance Detection on Financial Targets from SEC Filing Reports and Earnings Call Transcripts
by: Gyawali, Nikesh, et al.
Published: (2025)

Tougher Text, Smarter Models: Raising the Bar for Adversarial Defence Benchmarks
by: Wang, Yang, et al.
Published: (2025)

Same Company, Same Signal: The Role of Identity in Earnings Call Transcripts
by: Yu, Ding, et al.
Published: (2024)

Prompt-Induced Linguistic Fingerprints for LLM-Generated Fake News Detection
by: Wang, Chi, et al.
Published: (2025)

Natural Language Generation
by: van Miltenburg, Emiel, et al.
Published: (2025)

Train & Constrain: Phonologically Informed Tongue-Twister Generation from Topics and Paraphrases
by: Loakman, Tyler, et al.
Published: (2024)

SLIDE: A Framework Integrating Small and Large Language Models for Open-Domain Dialogues Evaluation
by: Zhao, Kun, et al.
Published: (2024)

On the Rigour of Scientific Writing: Criteria, Analysis, and Insights
by: James, Joseph, et al.
Published: (2024)

LLMs as Narcissistic Evaluators: When Ego Inflates Evaluation Scores
by: Liu, Yiqi, et al.
Published: (2023)

Effective Performance Measurement: Challenges and Opportunities in KPI Extraction from Earnings Calls
by: Aavang, Rasmus T., et al.
Published: (2026)

X-ray Made Simple: Lay Radiology Report Generation and Robust Evaluation
by: Zhao, Kun, et al.
Published: (2024)

Evaluating Large Language Models for Generalization and Robustness via Data Compression
by: Li, Yucheng, et al.
Published: (2024)

An Open Source Data Contamination Report for Large Language Models
by: Li, Yucheng, et al.
Published: (2023)

Ara-HOPE: Human-Centric Post-Editing Evaluation for Dialectal Arabic to Modern Standard Arabic Translation
by: Alabdullah, Abdullah, et al.
Published: (2025)

Who's Laughing Now? An Overview of Computational Humour Generation and Explanation
by: Loakman, Tyler, et al.
Published: (2025)

Emphasising Structured Information: Integrating Abstract Meaning Representation into LLMs for Enhanced Open-Domain Dialogue Evaluation
by: Yang, Bohao, et al.
Published: (2024)

Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts
by: Hong, Hanhua, et al.
Published: (2025)

Advanced Deep Learning Techniques for Analyzing Earnings Call Transcripts: Methodologies and Applications
by: Zakir, Umair, et al.
Published: (2025)

LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time-Sensitive Test Construction
by: Li, Yucheng, et al.
Published: (2023)

MedFactEval and MedAgentBrief: A Framework and Workflow for Generating and Evaluating Factual Clinical Summaries
by: Grolleau, François, et al.
Published: (2025)

CADGE: Context-Aware Dialogue Generation Enhanced with Graph-Structured Knowledge Aggregation
by: Zhang, Hongbo, et al.
Published: (2023)

Instruction-Guided Bullet Point Summarization of Long Financial Earnings Call Transcripts
by: Khatuya, Subhendu, et al.
Published: (2024)

VeriFact: Enhancing Long-Form Factuality Evaluation with Refined Fact Extraction and Reference Facts
by: Liu, Xin, et al.
Published: (2025)

Evaluating Large Language Models as Expert Annotators
by: Tseng, Yu-Min, et al.
Published: (2025)

Decision-Oriented Text Evaluation
by: Huang, Yu-Shiang, et al.
Published: (2025)

Language Model as an Annotator: Unsupervised Context-aware Quality Phrase Generation
by: Zhang, Zhihao, et al.
Published: (2023)

MiMIC: Multi-Modal Indian Earnings Calls Dataset to Predict Stock Prices
by: Ghosh, Sohom, et al.
Published: (2025)

Beyond Translation: LLM-Based Data Generation for Multilingual Fact-Checking
by: Chung, Yi-Ling, et al.
Published: (2025)

Evaluating LLM-Based Grant Proposal Review via Structured Perturbations
by: Thorne, William, et al.
Published: (2026)

Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
by: Wang, Yang, et al.
Published: (2025)

Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes
by: Loakman, Tyler, et al.
Published: (2025)

DRE: An Effective Dual-Refined Method for Integrating Small and Large Language Models in Open-Domain Dialogue Evaluation
by: Zhao, Kun, et al.
Published: (2025)

MMTE: Corpus and Metrics for Evaluating Machine Translation Quality of Metaphorical Language
by: Wang, Shun, et al.
Published: (2024)