| Main Authors: | Lee, Shane; Ng, Stella |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.04905 |
Similar Items
Jellyfish: A Large Language Model for Data Preprocessing
by: Zhang, Haochen, et al.
Published: (2023)
FLORA: Unsupervised Knowledge Graph Alignment by Fuzzy Logic
by: Peng, Yiwen, et al.
Published: (2025)
Auto-FP: An Experimental Study of Automated Feature Preprocessing for Tabular Data
by: Qi, Danrui, et al.
Published: (2023)
Bootstrapping Learned Cost Models with Synthetic SQL Queries
by: Nidd, Michael, et al.
Published: (2025)
Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees
by: Zeighami, Sepanta, et al.
Published: (2025)
CONCERTO: Complex Query Execution Mechanism-Aware Learned Cost Estimation
by: Zhang, Kaixin, et al.
Published: (2024)
Transformer-Gather, Fuzzy-Reconsider: A Scalable Hybrid Framework for Entity Resolution
by: Sharifi, Mohammadreza, et al.
Published: (2025)
100x Cost & Latency Reduction: Performance Analysis of AI Query Approximation using Lightweight Proxy Models
by: Chung, Yeounoh, et al.
Published: (2026)
EHR-SeqSQL : A Sequential Text-to-SQL Dataset For Interactively Exploring Electronic Health Records
by: Ryu, Jaehee, et al.
Published: (2024)
NEXT-EVAL: Next Evaluation of Traditional and LLM Web Data Record Extraction
by: Kim, Soyeon, et al.
Published: (2025)
AI-Driven Research for Databases
by: Cheng, Audrey, et al.
Published: (2026)
Interactive Ontology Matching with Cost-Efficient Learning
by: Cheng, Bin, et al.
Published: (2024)
Cross-Representation Benchmarking in Time-Series Electronic Health Records for Clinical Outcome Prediction
by: Chen, Tianyi, et al.
Published: (2025)
CoddLLM: Empowering Large Language Models for Data Analytics
by: Zhang, Jiani, et al.
Published: (2025)
Grounding Realizable Entities
by: Rabenberg, Michael, et al.
Published: (2024)
Online Detection of Anomalies in Temporal Knowledge Graphs with Interpretability
by: Zhang, Jiasheng, et al.
Published: (2024)
Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata
by: Poliakov, Mykhailo, et al.
Published: (2024)
EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing
by: Zhu, Yizhang, et al.
Published: (2025)
The Interpretability Analysis of the Model Can Bring Improvements to the Text-to-SQL Task
by: Zhang, Cong
Published: (2025)
VERSA: Verified Event Data Format for Reliable Soccer Analytics
by: Jo, Geonhee, et al.
Published: (2026)
Enabling Secure and Ephemeral AI Workloads in Data Mesh Environments
by: Patel, Chinkit, et al.
Published: (2025)
A2RAG: Adaptive Agentic Graph Retrieval for Cost-Aware and Reliable Reasoning
by: Liu, Jiate, et al.
Published: (2026)
Reqo: A Comprehensive Learning-Based Cost Model for Robust and Explainable Query Optimization
by: Chang, Baoming, et al.
Published: (2025)
SABER: A SQL-Compatible Semantic Document Processing System Based on Extended Relational Algebra
by: Lee, Changjae, et al.
Published: (2025)
TinySQL: A Progressive Text-to-SQL Dataset for Mechanistic Interpretability Research
by: Harrasse, Abir, et al.
Published: (2025)
KGPrune: a Web Application to Extract Subgraphs of Interest from Wikidata with Analogical Pruning
by: Monnin, Pierre, et al.
Published: (2024)
Cost-Based Semantics for Querying Inconsistent Weighted Knowledge Bases
by: Bienvenu, Meghyn, et al.
Published: (2024)
Towards Improving Interpretability of Language Model Generation through a Structured Knowledge Discovery Approach
by: Liu, Shuqi, et al.
Published: (2025)
Knowledge Graph Construction for Stock Markets with LLM-Based Explainable Reasoning
by: Lee, Cheonsol, et al.
Published: (2025)
Towards Scalable Schema Mapping using Large Language Models
by: Buss, Christopher, et al.
Published: (2025)
Scorecards for Synthetic Medical Data Evaluation and Reporting
by: Zamzmi, Ghada, et al.
Published: (2024)
Retrieval-Augmented Generation of Ontologies from Relational Databases
by: Nayyeri, Mojtaba, et al.
Published: (2025)
Trustworthy AI in the Agentic Lakehouse: from Concurrency to Governance
by: Tagliabue, Jacopo, et al.
Published: (2025)
Efficiently Sampling Interval Patterns from Numerical Databases
by: Bekkoucha, Djawad, et al.
Published: (2025)
Database Normalization via Dual-LLM Self-Refinement
by: Jo, Eunjae, et al.
Published: (2025)
Graphy'our Data: Towards End-to-End Modeling, Exploring and Generating Report from Raw Data
by: Lai, Longbin, et al.
Published: (2025)
Efficient Dynamic Clustering: Capturing Patterns from Historical Cluster Evolution
by: Gu, Binbin, et al.
Published: (2022)
LaDe: The First Comprehensive Last-mile Delivery Dataset from Industry
by: Wu, Lixia, et al.
Published: (2023)
OVT-MLCS: An Online Visual Tool for MLCS Mining from Long or Big Sequences
by: Wang, Zhi, et al.
Published: (2026)
PrepBench: How Far Are We from Natural-Language-Driven Data Preparation?
by: Xu, Jingzhe, et al.
Published: (2026)