| Main Authors: | Agarwal, Shubham, Biswal, Asim, Zeighami, Sepanta, Cheung, Alvin, Gonzalez, Joseph, Parameswaran, Aditya G. |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.13521 |
Similar Items
Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees
by: Zeighami, Sepanta, et al.
Published: (2025)
LLM-Powered Proactive Data Systems
by: Zeighami, Sepanta, et al.
Published: (2025)
Supporting Our AI Overlords: Redesigning Data Systems to be Agent-First
by: Liu, Shu, et al.
Published: (2025)
Task Cascades for Efficient Unstructured Data Processing
by: Shankar, Shreya, et al.
Published: (2026)
Featurized-Decomposition Join: Low-Cost Semantic Joins with Guarantees
by: Zeighami, Sepanta, et al.
Published: (2025)
NUDGE: Lightweight Non-Parametric Fine-Tuning of Embeddings for Retrieval
by: Zeighami, Sepanta, et al.
Published: (2024)
Semantic Data Processing with Holistic Data Understanding
by: Sun, Youran, et al.
Published: (2026)
RAG Without the Lag: Interactive Debugging for Retrieval-Augmented Generation Pipelines
by: Lauro, Quentin Romero, et al.
Published: (2025)
Multi-Objective Agentic Rewrites for Unstructured Data Processing
by: Wei, Lindsey Linxi, et al.
Published: (2025)
BiasBuster: a Neural Approach for Accurate Estimation of Population Statistics using Biased Location Data
by: Zeighami, Sepanta, et al.
Published: (2024)
Can AI Agents Answer Your Data Questions? A Benchmark for Data Agents
by: Ma, Ruiying, et al.
Published: (2026)
Theoretical Analysis of Learned Database Operations under Distribution Shift through Distribution Learnability
by: Zeighami, Sepanta, et al.
Published: (2024)
Towards Establishing Guaranteed Error for Learned Database Operations
by: Zeighami, Sepanta, et al.
Published: (2024)
Text2SQL is Not Enough: Unifying AI and Databases with TAG
by: Biswal, Asim, et al.
Published: (2024)
DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing
by: Shankar, Shreya, et al.
Published: (2024)
AgentSM: Semantic Memory for Agentic Text-to-SQL
by: Biswal, Asim, et al.
Published: (2026)
TWIX: Automatically Reconstructing Structured Data from Templatized Documents
by: Lin, Yiming, et al.
Published: (2025)
TARGET: Benchmarking Table Retrieval for Generative Tasks
by: Ji, Xingyu, et al.
Published: (2025)
The Time is Here for Just-in-Time Systems: Challenges and Opportunities
by: Liu, Shu, et al.
Published: (2026)
A Multi-Agent System for Semantic Mapping of Relational Data to Knowledge Graphs
by: Trajanoska, Milena, et al.
Published: (2025)
Towards Accurate and Efficient Document Analytics with Large Language Models
by: Lin, Yiming, et al.
Published: (2024)
Grid-Based Projection of Spatial Data into Knowledge Graphs
by: Anjomshoaa, Amin, et al.
Published: (2024)
LLM/Agent-as-Data-Analyst: A Survey
by: Tang, Zirui, et al.
Published: (2025)
PLOP: Cost-Based Placement of Semantic Operators in Hybrid Query Plans
by: Mang, Qiuyang, et al.
Published: (2026)
VeriMinder: Mitigating Analytical Vulnerabilities in NL2SQL
by: Mohole, Shubham, et al.
Published: (2025)
Is Agent Memory a Database? Rethinking Data Foundations for Long-Term AI Agent Memory
by: Orogat, Abdelghny, et al.
Published: (2026)
Real-Time Health Analytics Using Ontology-Driven Complex Event Processing and LLM Reasoning: A Tuberculosis Case Study
by: Chandra, Ritesh, et al.
Published: (2025)
Interactive Data Harmonization with LLM Agents: Opportunities and Challenges
by: Santos, Aécio, et al.
Published: (2025)
Blue Data Intelligence Layer: Streaming Data and Agents for Multi-source Multi-modal Data-Centric Applications
by: Aminnaseri, Moin, et al.
Published: (2026)
A Survey of Data Agents: Emerging Paradigm or Overstated Hype?
by: Zhu, Yizhang, et al.
Published: (2025)
Transparent, Evaluable, and Accessible Data Agents: A Proof-of-Concept Framework
by: Bahador, Nooshin
Published: (2025)
Managing FAIR Knowledge Graphs as Polyglot Data End Points: A Benchmark based on the rdf2pg Framework and Plant Biology Data
by: Brandizi, Marco, et al.
Published: (2025)
Evaluation of Pipelines for Data Integration into Knowledge Graphs
by: Hofer, Marvin, et al.
Published: (2026)
HoneyBee: A Scalable Modular Framework for Creating Multimodal Oncology Datasets with Foundational Embedding Models
by: Tripathi, Aakash, et al.
Published: (2024)
Autonomous Data Processing using Meta-Agents
by: Khurana, Udayan
Published: (2026)
Knowledge Graph-Guided Multi-Agent Distillation for Reliable Industrial Question Answering with Datasets
by: Pan, Jiqun, et al.
Published: (2025)
KIF: A Wikidata-Based Framework for Integrating Heterogeneous Knowledge Sources
by: Lima, Guilherme, et al.
Published: (2024)
KGpipe: Generation and Evaluation of Pipelines for Data Integration into Knowledge Graphs
by: Hofer, Marvin, et al.
Published: (2025)
Integrating Activity Predictions in Knowledge Graphs
by: Hare, Forrest, et al.
Published: (2025)
Deontic Knowledge Graphs for Privacy Compliance in Multimodal Disaster Data Sharing
by: Echenim, Kelvin Uzoma, et al.
Published: (2026)