Bibliographic Details
Main Authors: Surana, Shraddha, Srinivasan, Ashwin, Bain, Michael
Format: Preprint
Published: 2025
Subjects: Artificial Intelligence; Software Engineering; H.4.1; D.2.2; I.2.2
Online Access: https://arxiv.org/abs/2503.14488
_version_ 1866910044352675840
author Surana, Shraddha
Srinivasan, Ashwin
Bain, Michael
author_facet Surana, Shraddha
Srinivasan, Ashwin
Bain, Michael
contents Engineering information systems for scientific data analysis presents significant challenges: complex workflows requiring exploration of large solution spaces, close collaboration with domain specialists, and the need for maintainable, interpretable implementations. Traditional manual development is time-consuming, while "No Code" approaches using large language models (LLMs) often produce unreliable systems. We present iProg, a tool implementing Interactive Structured Inductive Programming. iProg employs a variant of a '2-way Intelligibility' communication protocol to constrain collaborative system construction by a human and an LLM. Specifically, given a natural-language description of the overall data analysis task, iProg uses an LLM to first identify an appropriate decomposition of the problem into a declarative representation, expressed as a Data Flow Diagram (DFD). In a second phase, iProg then uses an LLM to generate code for each DFD process. In both stages, human feedback, mediated through the constructs provided by the communication protocol, is used to verify LLMs' outputs. We evaluate iProg extensively on two published scientific collaborations (astrophysics and biochemistry), demonstrating that it is possible to identify appropriate system decompositions and construct end-to-end information systems with better performance, higher code quality, and order-of-magnitude faster development compared to Low Code/No Code alternatives. The tool is available at: https://shraddhasurana.github.io/dhaani/
format Preprint
id arxiv_https___arxiv_org_abs_2503_14488
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Engineering Systems for Data Analysis Using Interactive Structured Inductive Programming
Surana, Shraddha
Srinivasan, Ashwin
Bain, Michael
Artificial Intelligence
Software Engineering
H.4.1; D.2.2; I.2.2
title Engineering Systems for Data Analysis Using Interactive Structured Inductive Programming
topic Artificial Intelligence
Software Engineering
H.4.1; D.2.2; I.2.2
url https://arxiv.org/abs/2503.14488