Library Catalog

Bibliographic Details
Main Authors:	Shbita, Basel; Gentile, Anna Lisa; Zhang, Bing; An, Sungeun; Thakur, Shailja; Asthana, Shubhi; Zhou, Yi; Surendran, Saptha; Ahmed, Farhan; Kulkarni, Rohan; Ong, Yuya Jeremy; DeLuca, Chad; Patel, Hima
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.23027

Similar Items

STaD: Scaffolded Task Design for Identifying Compositional Skill Gaps in LLMs
by: An, Sungeun, et al.
Published: (2026)

STRIDE: A Systematic Framework for Selecting AI Modalities -- Agentic AI, AI Assistants, or LLM Calls
by: Asthana, Shubhi, et al.
Published: (2025)

MermaidSeqBench: An Evaluation Benchmark for NL-to-Mermaid Sequence Diagram Generation
by: Shbita, Basel, et al.
Published: (2025)

Runtime-Structured Task Decomposition for Agentic Coding Systems
by: Asthana, Shubhi, et al.
Published: (2026)

OneShield -- the Next Generation of LLM Guardrails
by: DeLuca, Chad, et al.
Published: (2025)

LogitScope: A Framework for Analyzing LLM Uncertainty Through Information Metrics
by: Ahmed, Farhan, et al.
Published: (2026)

Deploying Privacy Guardrails for LLMs: A Comparative Analysis of Real-World Applications
by: Asthana, Shubhi, et al.
Published: (2025)

Adaptive Decoding via Test-Time Policy Learning for Self-Improving Generation
by: Bhardwaj, Asmita, et al.
Published: (2026)

WikiVQABench: A Knowledge-Grounded Visual Question Answering Benchmark from Wikipedia and Wikidata
by: Shbita, Basel, et al.
Published: (2026)

LLMON: An LLM-native Markup Language to Leverage Structure and Semantics at the LLM Interface
by: Hind, Michael, et al.
Published: (2026)

Generating Verifiable Chain of Thoughts from Execution-Traces
by: Thakur, Shailja, et al.
Published: (2025)

Evaluating Ill-Defined Tasks in Large Language Models
by: Zhou, Yi, et al.
Published: (2026)

Backprompting: Leveraging Synthetic Production Data for Health Advice Guardrails
by: Cheng, Kellen Tan, et al.
Published: (2025)

Adaptive PII Mitigation Framework for Large Language Models
by: Asthana, Shubhi, et al.
Published: (2025)

Projeto e Metamorfose: Contribuições de Gilberto Velho para os Estudos sobre Carreiras
by: Gabriela DeLuca
Published: (2016)

Inked Careers: Tattooing Professional Paths
by: Gabriela DeLuca
Published: (2016)

Letter to the Editor: The Availability of Midwifery Care in Rural United States Communities
by: Myra DeLuca
Published: (2025)

Material Selection. B-1 Evaluating and Selecting Learning Materials, Document No. 10d, Revised. Independent Study Training Material for Professional Supervisory Competencies.
by: DeLuca, Joan
Published: (1975)

Detecting and Mitigating Memorization in Diffusion Models through Anisotropy of the Log-Probability
by: Asthana, Rohan, et al.
Published: (2026)

On being a forest soil scientist—Reflections at the 14th North American Forest Soils Conference
by: Thomas H. DeLuca
Published: (2024)

Aging: An Annotated Guide to Government Publications. The University of Connecticut Library Bibliography Series, Number 3.
by: DeLuca, L., et al.
Published: (1975)

NC-Bench: An LLM Benchmark for Evaluating Conversational Competence
by: Moore, Robert J., et al.
Published: (2026)

FedDebug: Systematic Debugging for Federated Learning Applications
by: Gill, Waris, et al.
Published: (2023)

Data-Prep-Kit: getting your data ready for LLM application development
by: Wood, David, et al.
Published: (2024)

Mixing Condition Numbers and Oracles for Accurate Floating-point Debugging
by: Kulkarni, Bhargav, et al.
Published: (2025)

CIFE: Code Instruction-Following Evaluation
by: Gunnu, Sravani, et al.
Published: (2025)

ILAEDA: An Imitation Learning Based Approach for Automatic Exploratory Data Analysis
by: Manatkar, Abhijit, et al.
Published: (2024)

Environmental policy behavioral spillovers: The impact of California's single‐use carryout bag ban on the use of unregulated single‐use plastics
by: Sungeun Yoon, et al.
Published: (2024)

Enterprise Benchmarks for Large Language Model Evaluation
by: Zhang, Bing, et al.
Published: (2024)

Fixing Hardware Security Bugs with Large Language Models
by: Ahmad, Baleegh, et al.
Published: (2023)

From challenge to innovation: A grassroots study of teachers’ classroom assessment innovations
by: Christopher DeLuca, et al.
Published: (2024)

How hermeneutics can guide grading in integrated STEAM education: An evidence‐informed perspective
by: Christopher DeLuca, et al.
Published: (2024)

Balancing disciplinary and integrated learning: How exemplary STEM teachers negotiate tensions of practice
by: Michelle Dubek, et al.
Published: (2024)

Le dysfonctionnement socio-spatial des grands ensembles en Algérie: technique de l’analyse wayfinding par méthode “movement traces” et l’analyse morphologique (syntaxe spatiale) par logiciel “depthmap1”
by: Amara Hima
Published: (2018)

Dextr: Zero-Shot Neural Architecture Search with Singular Value Decomposition and Extrinsic Curvature
by: Asthana, Rohan, et al.
Published: (2025)

An Automatic Prompt Generation System for Tabular Data Tasks
by: Akella, Ashlesha, et al.
Published: (2024)

AutoChip: Automating HDL Generation Using LLM Feedback
by: Thakur, Shailja, et al.
Published: (2023)

Automatically Improving LLM-based Verilog Generation using EDA Tool Feedback
by: Blocklove, Jason, et al.
Published: (2024)

Peak‐to‐average power ratio reduction of orthogonal frequency division multiplexing signals using improved salp swarm optimization‐based partial transmit sequence model
by: Vandana Tripathi, et al.
Published: (2025)

Internal Resorption: A Retrospective Cone‐Beam Computed Tomography Analysis of 50 Cases With Outcome Assessment
by: Schuyler DeLuca, et al.
Published: (2026)