| Main Authors: | Shankar, Shreya, Li, Haotian, Asawa, Parth, Hulsebos, Madelon, Lin, Yiming, Zamfirescu-Pereira, J. D., Chase, Harrison, Fu-Hinthorn, Will, Parameswaran, Aditya G., Wu, Eugene |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2401.03038 |
Similar Items
PROMPTEVALS: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines
by: Vir, Reya, et al.
Published: (2025)
Towards Accurate and Efficient Document Analytics with Large Language Models
by: Lin, Yiming, et al.
Published: (2024)
TARGET: Benchmarking Table Retrieval for Generative Tasks
by: Ji, Xingyu, et al.
Published: (2025)
LLM-Powered Proactive Data Systems
by: Zeighami, Sepanta, et al.
Published: (2025)
Task Cascades for Efficient Unstructured Data Processing
by: Shankar, Shreya, et al.
Published: (2026)
Featurized-Decomposition Join: Low-Cost Semantic Joins with Guarantees
by: Zeighami, Sepanta, et al.
Published: (2025)
Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees
by: Zeighami, Sepanta, et al.
Published: (2025)
Towards Contextual Sensitive Data Detection
by: Telkamp, Liang, et al.
Published: (2025)
DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing
by: Shankar, Shreya, et al.
Published: (2024)
Are We Asking the Right Questions? On Ambiguity in Natural Language Queries for Tabular Data Analysis
by: Gomm, Daniel, et al.
Published: (2025)
Rethinking Dataset Discovery with DataScout
by: Lin, Rachel, et al.
Published: (2025)
Flow with FlorDB: Incremental Context Maintenance for the Machine Learning Lifecycle
by: Garcia, Rolando, et al.
Published: (2024)
Testing Database Systems with Large Language Model Synthesized Fragments
by: Zhong, Suyang, et al.
Published: (2025)
Semantic Data Processing with Holistic Data Understanding
by: Sun, Youran, et al.
Published: (2026)
Observatory: Characterizing Embeddings of Relational Tables
by: Cong, Tianji, et al.
Published: (2023)
Fine-Grained Table Retrieval Through the Lens of Complex Queries
by: Kosiuk, Wojciech, et al.
Published: (2026)
DIRT: Database-Integrated Random Testing
by: Keles, Alperen, et al.
Published: (2026)
Multi-Objective Agentic Rewrites for Unstructured Data Processing
by: Wei, Lindsey Linxi, et al.
Published: (2025)
Prompt Migration: Stabilizing GenAI Applications with Evolving Large Language Models
by: Tripathi, Shivani, et al.
Published: (2025)
CUBES: A Parallel Synthesizer for SQL Using Examples
by: Brancas, Ricardo, et al.
Published: (2022)
Synthesizing Document Database Queries using Collection Abstractions
by: Liu, Qikang, et al.
Published: (2024)
Steering Semantic Data Processing With DocWrangler
by: Shankar, Shreya, et al.
Published: (2025)
AI-Assisted SQL Authoring at Industry Scale
by: Maddila, Chandra, et al.
Published: (2024)
Quality Assessment of Tabular Data using Large Language Models and Code Generation
by: Akella, Ashlesha, et al.
Published: (2025)
Can AI Agents Answer Your Data Questions? A Benchmark for Data Agents
by: Ma, Ruiying, et al.
Published: (2026)
The Time is Here for Just-in-Time Systems: Challenges and Opportunities
by: Liu, Shu, et al.
Published: (2026)
Process Modeling With Large Language Models
by: Kourani, Humam, et al.
Published: (2024)
Object Graph Programming
by: Thimmaiah, Aditya, et al.
Published: (2024)
Enhancing LLM Fine-tuning for Text-to-SQLs by SQL Quality Measurement
by: Sarker, Shouvon, et al.
Published: (2024)
Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences
by: Shankar, Shreya, et al.
Published: (2024)
Adaptive Data Quality Scoring Operations Framework using Drift-Aware Mechanism for Industrial Applications
by: Bayram, Firas, et al.
Published: (2024)
Liberal Entity Matching as a Compound AI Toolchain
by: Fu, Silvery D., et al.
Published: (2024)
Automated Tensor-Relational Decomposition for Large-Scale Sparse Tensor Computation
by: Tang, Yuxin, et al.
Published: (2026)
How well do LLMs reason over tabular data, really?
by: Wolff, Cornelius, et al.
Published: (2025)
GEE-OPs: An Operator Knowledge Base for Geospatial Code Generation on the Google Earth Engine Platform Powered by Large Language Models
by: Hou, Shuyang, et al.
Published: (2024)
AssertionBench: A Benchmark to Evaluate Large-Language Models for Assertion Generation
by: Pulavarthi, Vaishnavi, et al.
Published: (2024)
Mining Constraints from Reference Process Models for Detecting Best-Practice Violations in Event Logs
by: Rebmann, Adrian, et al.
Published: (2024)
Dinkel: State-Aware and Granular Framework for Validating Graph Databases
by: Wüst, Celine, et al.
Published: (2024)
Vextra: A Unified Middleware Abstraction for Heterogeneous Vector Database Systems
by: Suri, Chandan, et al.
Published: (2026)
A Practical Framework for Flaky Failure Triage in Distributed Database Continuous Integration
by: Zhu, Jun-Peng, et al.
Published: (2026)