| Main Authors: | Zhang, Ran; Eger, Steffen |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2409.03659 |
Similar Items
LiTransProQA: an LLM-based Literary Translation evaluation metric with Professional Question Answering
by: Zhang, Ran, et al.
Published: (2025)
PromptOptMe: Error-Aware Prompt Compression for LLM-based MT Evaluation Metrics
by: Larionov, Daniil, et al.
Published: (2024)
Do Emotions Really Affect Argument Convincingness? A Dynamic Approach with LLM-based Manipulation Checks
by: Chen, Yanran, et al.
Published: (2025)
Cross-lingual Cross-temporal Summarization: Dataset, Models, Evaluation
by: Zhang, Ran, et al.
Published: (2023)
How Good Are LLMs for Literary Translation, Really? Literary Translation Evaluation with Humans and LLMs
by: Zhang, Ran, et al.
Published: (2024)
PrExMe! Large Scale Prompt Exploration of Open Source LLMs for Machine Translation and Summarization Evaluation
by: Leiter, Christoph, et al.
Published: (2024)
ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models
by: Belouadi, Jonas, et al.
Published: (2022)
USCORE: An Effective Approach to Fully Unsupervised Evaluation Metrics for Machine Translation
by: Belouadi, Jonas, et al.
Published: (2022)
BatchGEMBA: Token-Efficient Machine Translation Evaluation with Batched Prompting and Prompt Compression
by: Larionov, Daniil, et al.
Published: (2025)
Beyond Reproduction: A Paired-Task Framework for Assessing LLM Comprehension and Creativity in Literary Translation
by: Zhang, Ran, et al.
Published: (2026)
TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning
by: Greisinger, Christian, et al.
Published: (2026)
Is there really a Citation Age Bias in NLP?
by: Nguyen, Hoa, et al.
Published: (2024)
BMX: Boosting Natural Language Generation Metrics with Explainability
by: Leiter, Christoph, et al.
Published: (2022)
AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ
by: Belouadi, Jonas, et al.
Published: (2023)
Graph-Guided Textual Explanation Generation Framework
by: Yuan, Shuzhou, et al.
Published: (2024)
LLM Analysis of 150+ years of German Parliamentary Debates on Migration Reveals Shift from Post-War Solidarity to Anti-Solidarity in the Last Decade
by: Kostikova, Aida, et al.
Published: (2025)
DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ
by: Belouadi, Jonas, et al.
Published: (2024)
Evaluating Diversity in Automatic Poetry Generation
by: Chen, Yanran, et al.
Published: (2024)
Zhyper: Factorized Hypernetworks for Conditioned LLM Fine-Tuning
by: Abdalla, M. H. I., et al.
Published: (2025)
DeepSeek-R1 vs. o3-mini: How Well can Reasoning LLMs Evaluate MT and Summarization?
by: Larionov, Daniil, et al.
Published: (2025)
Evaluating Large Language Models for Structured Science Summarization in the Open Research Knowledge Graph
by: Nechakhin, Vladyslav, et al.
Published: (2024)
Can LLM-Augmented autonomous agents cooperate?, An evaluation of their cooperative capabilities through Melting Pot
by: Mosquera, Manuel, et al.
Published: (2024)
CROC: Evaluating and Training T2I Metrics with Pseudo- and Human-Labeled Contrastive Robustness Checks
by: Leiter, Christoph, et al.
Published: (2025)
xCOMET-lite: Bridging the Gap Between Efficiency and Quality in Learned MT Evaluation Metrics
by: Larionov, Daniil, et al.
Published: (2024)
Emotionally Charged, Logically Blurred: AI-driven Emotional Framing Impairs Human Fallacy Detection
by: Chen, Yanran, et al.
Published: (2025)
GerAV: Towards New Heights in German Authorship Verification using Fine-Tuned LLMs on a New Benchmark
by: Kiefer, Lotta, et al.
Published: (2026)
NLLG Quarterly arXiv Report 09/24: What are the most influential current AI Papers?
by: Leiter, Christoph, et al.
Published: (2024)
Towards Explainable Evaluation Metrics for Machine Translation
by: Leiter, Christoph, et al.
Published: (2023)
Syntactic Language Change in English and German: Metrics, Parsers, and Convergences
by: Chen, Yanran, et al.
Published: (2024)
Argument Summarization and its Evaluation in the Era of Large Language Models
by: Altemeyer, Moritz, et al.
Published: (2025)
ValueGround: Evaluating Culture-Conditioned Visual Value Grounding in MLLMs
by: Wang, Zhipin, et al.
Published: (2026)
MARS: toward more efficient multi-agent collaboration for LLM reasoning
by: Wang, Xiao, et al.
Published: (2025)
ContrastScore: Towards Higher Quality, Less Biased, More Efficient Evaluation Metrics with Contrastive Evaluation
by: Wang, Xiao, et al.
Published: (2025)
LLLMs: A Data-Driven Survey of Evolving Research on Limitations of Large Language Models
by: Kostikova, Aida, et al.
Published: (2025)
Fine-Grained Detection of Solidarity for Women and Migrants in 155 Years of German Parliamentary Debates
by: Kostikova, Aida, et al.
Published: (2022)
AIstorian lets AI be a historian: A KG-powered multi-agent system for accurate biography generation
by: Li, Fengyu, et al.
Published: (2025)
Research on Tibetan Tourism Viewpoints information generation system based on LLM
by: Qi, Jinhu, et al.
Published: (2024)
The author is dead, but what if they never lived? A reception experiment on Czech AI- and human-authored poetry
by: Marklová, Anna, et al.
Published: (2025)
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis
by: Belouadi, Jonas, et al.
Published: (2025)
Inherent and emergent liability issues in LLM-based agentic systems: a principal-agent perspective
by: Gabison, Garry A., et al.
Published: (2025)