| Main Authors: | Sadia, Mushtari; Chowdhury, Amrita Roy; Chen, Ang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.14601 |
Similar Items
SQUiD: Synthesizing Relational Databases from Unstructured Text
by: Sadia, Mushtari, et al.
Published: (2025)
Cortex AISQL: A Production SQL Engine for Unstructured Data
by: Liskowski, Paweł, et al.
Published: (2025)
Process Mining for Unstructured Data: Challenges and Research Directions
by: Koschmider, Agnes, et al.
Published: (2023)
UQE: A Query Engine for Unstructured Databases
by: Dai, Hanjun, et al.
Published: (2024)
The Design of an LLM-powered Unstructured Analytics System
by: Anderson, Eric, et al.
Published: (2024)
Health System Scale Semantic Search Across Unstructured Clinical Notes
by: Mutinda, Faith Wavinya, et al.
Published: (2026)
Mining Weighted Sequential Patterns in Incremental Uncertain Databases
by: Roy, Kashob Kumar, et al.
Published: (2024)
Natural Language Interfaces for Spatial and Temporal Databases: A Comprehensive Overview of Methods, Taxonomy, and Future Directions
by: Acharja, Samya, et al.
Published: (2026)
DSL-R1: From SQL to DSL for Training Retrieval Agents across Structured and Unstructured Data with Reinforcement Learning
by: Hu, Yunhai, et al.
Published: (2026)
OpenMLDB: A Real-Time Relational Data Feature Computation System for Online ML
by: Zhou, Xuanhe, et al.
Published: (2025)
Data Science: a Natural Ecosystem
by: Porcu, Emilio, et al.
Published: (2025)
TAIJI: MCP-based Multi-Modal Data Analytics on Data Lakes
by: Zhang, Chao, et al.
Published: (2025)
Blue Data Intelligence Layer: Streaming Data and Agents for Multi-source Multi-modal Data-Centric Applications
by: Aminnaseri, Moin, et al.
Published: (2026)
CMDBench: A Benchmark for Coarse-to-fine Multimodal Data Discovery in Compound AI Systems
by: Feng, Yanlin, et al.
Published: (2024)
Data Quality Awareness: A Journey from Traditional Data Management to Data Science Systems
by: Dong, Sijie, et al.
Published: (2024)
Common Data Format (CDF): A Standardized Format for Match-Data in Football (Soccer)
by: Anzer, Gabriel, et al.
Published: (2025)
AgenticData: An Agentic Data Analytics System for Heterogeneous Data
by: Sun, Ji, et al.
Published: (2025)
A Survey of Data Agents: Emerging Paradigm or Overstated Hype?
by: Zhu, Yizhang, et al.
Published: (2025)
Computationally Intensive Research: Advancing a Role for Secondary Analysis of Qualitative Data
by: Mohajeri, Kaveh, et al.
Published: (2025)
RUST-BENCH: Benchmarking LLM Reasoning on Unstructured Text within Structured Tables
by: Abhyankar, Nikhil, et al.
Published: (2025)
Shapley Value Computation in Ontology-Mediated Query Answering
by: Bienvenu, Meghyn, et al.
Published: (2024)
LLM/Agent-as-Data-Analyst: A Survey
by: Tang, Zirui, et al.
Published: (2025)
Powering In-Database Dynamic Model Slicing for Structured Data Analytics
by: Zeng, Lingze, et al.
Published: (2024)
NFDI4DSO: Towards a BFO Compliant Ontology for Data Science
by: Gesese, Genet Asefa, et al.
Published: (2024)
ARCADE: A Real-Time Data System for Hybrid and Continuous Query Processing across Diverse Data Modalities
by: Yang, Jingyi, et al.
Published: (2025)
Towards Controllable Time Series Generation
by: Bao, Yifan, et al.
Published: (2024)
PDX: A Data Layout for Vector Similarity Search
by: Kuffo, Leonardo, et al.
Published: (2025)
A System and Benchmark for LLM-based Q&A on Heterogeneous Data
by: Fokoue, Achille, et al.
Published: (2024)
ZeroCard: Cardinality Estimation with Zero Dependence on Target Databases -- No Data, No Query, No Retraining
by: Xu, Xianghong, et al.
Published: (2025)
Exploiting Formal Concept Analysis for Data Modeling in Data Lakes
by: Bendimerad, Anes, et al.
Published: (2024)
AI-Driven Generation of Data Contracts in Modern Data Engineering Systems
by: Bhoite, Harshraj
Published: (2025)
LEDD: Large Language Model-Empowered Data Discovery in Data Lakes
by: An, Qi, et al.
Published: (2025)
Declarative Privacy-Preserving Inference Queries
by: Guan, Hong, et al.
Published: (2024)
A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges
by: Singh, Aditi, et al.
Published: (2024)
Managing FAIR Knowledge Graphs as Polyglot Data End Points: A Benchmark based on the rdf2pg Framework and Plant Biology Data
by: Brandizi, Marco, et al.
Published: (2025)
Transparent, Evaluable, and Accessible Data Agents: A Proof-of-Concept Framework
by: Bahador, Nooshin
Published: (2025)
BDI-Kit Demo: A Toolkit for Programmable and Conversational Data Harmonization
by: Lopez, Roque, et al.
Published: (2026)
A Semantic Approach for Big Data Exploration in Industry 4.0
by: Berges, Idoia, et al.
Published: (2024)
Serving Deep Learning Model in Relational Databases
by: Zhou, Lixi, et al.
Published: (2023)
PLM4NDV: Minimizing Data Access for Number of Distinct Values Estimation with Pre-trained Language Models
by: Xu, Xianghong, et al.
Published: (2025)