Bibliographic Details
Main Authors: Oestreich, Julian; Müller, Lydia
Format: Preprint
Published: 2025
Subjects: Computation and Language; Artificial Intelligence; Information Retrieval
Online Access: https://arxiv.org/abs/2508.15910
_version_ 1866909748614397952
author Oestreich, Julian
Müller, Lydia
contents We present a comprehensive evaluation of structured decoding for text-to-table generation with large language models (LLMs). While previous work has primarily focused on unconstrained generation of tables, the impact of enforcing structural constraints during generation remains underexplored. We systematically compare schema-guided (structured) decoding to standard one-shot prompting across three diverse benchmarks (E2E, Rotowire, and Livesum) using open-source LLMs of up to 32B parameters, assessing the performance of table generation approaches in resource-constrained settings. Our experiments cover a wide range of evaluation metrics at cell, row, and table levels. Results demonstrate that structured decoding significantly enhances the validity and alignment of generated tables, particularly in scenarios demanding precise numerical alignment (Rotowire), but may degrade performance in contexts involving densely packed textual information (E2E) or extensive aggregation over lengthy texts (Livesum). We further analyze the suitability of different evaluation metrics and discuss the influence of model size.
format Preprint
id arxiv_https___arxiv_org_abs_2508_15910
institution arXiv
publishDate 2025
record_format arxiv
title Evaluating Structured Decoding for Text-to-Table Generation: Evidence from Three Datasets
topic Computation and Language
Artificial Intelligence
Information Retrieval
url https://arxiv.org/abs/2508.15910