| Main Authors: | Shbita, Basel; Gentile, Anna Lisa; Zhang, Bing; An, Sungeun; Thakur, Shailja; Asthana, Shubhi; Zhou, Yi; Surendran, Saptha; Ahmed, Farhan; Kulkarni, Rohan; Ong, Yuya Jeremy; DeLuca, Chad; Patel, Hima |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.23027 |
Similar Items
STaD: Scaffolded Task Design for Identifying Compositional Skill Gaps in LLMs
by: An, Sungeun, et al.
Published: (2026)
STRIDE: A Systematic Framework for Selecting AI Modalities -- Agentic AI, AI Assistants, or LLM Calls
by: Asthana, Shubhi, et al.
Published: (2025)
MermaidSeqBench: An Evaluation Benchmark for NL-to-Mermaid Sequence Diagram Generation
by: Shbita, Basel, et al.
Published: (2025)
Runtime-Structured Task Decomposition for Agentic Coding Systems
by: Asthana, Shubhi, et al.
Published: (2026)
OneShield -- the Next Generation of LLM Guardrails
by: DeLuca, Chad, et al.
Published: (2025)
LogitScope: A Framework for Analyzing LLM Uncertainty Through Information Metrics
by: Ahmed, Farhan, et al.
Published: (2026)
Deploying Privacy Guardrails for LLMs: A Comparative Analysis of Real-World Applications
by: Asthana, Shubhi, et al.
Published: (2025)
Adaptive Decoding via Test-Time Policy Learning for Self-Improving Generation
by: Bhardwaj, Asmita, et al.
Published: (2026)
WikiVQABench: A Knowledge-Grounded Visual Question Answering Benchmark from Wikipedia and Wikidata
by: Shbita, Basel, et al.
Published: (2026)
LLMON: An LLM-native Markup Language to Leverage Structure and Semantics at the LLM Interface
by: Hind, Michael, et al.
Published: (2026)
Generating Verifiable Chain of Thoughts from Execution-Traces
by: Thakur, Shailja, et al.
Published: (2025)
Evaluating Ill-Defined Tasks in Large Language Models
by: Zhou, Yi, et al.
Published: (2026)
Backprompting: Leveraging Synthetic Production Data for Health Advice Guardrails
by: Cheng, Kellen Tan, et al.
Published: (2025)
Adaptive PII Mitigation Framework for Large Language Models
by: Asthana, Shubhi, et al.
Published: (2025)
Projeto e Metamorfose: Contribuições de Gilberto Velho para os Estudos sobre Carreiras
by: Gabriela DeLuca
Published: (2016)
Inked Careers: Tattooing Professional Paths
by: Gabriela DeLuca
Published: (2016)
Letter to the Editor: The Availability of Midwifery Care in Rural United States Communities
by: Myra DeLuca
Published: (2025)
Material Selection. B-1 Evaluating and Selecting Learning Materials, Document No. 10d, Revised. Independent Study Training Material for Professional Supervisory Competencies.
by: DeLuca, Joan
Published: (1975)
Detecting and Mitigating Memorization in Diffusion Models through Anisotropy of the Log-Probability
by: Asthana, Rohan, et al.
Published: (2026)
On being a forest soil scientist—Reflections at the 14th North American Forest Soils Conference
by: Thomas H. DeLuca
Published: (2024)
Aging: An Annotated Guide to Government Publications. The University of Connecticut Library Bibliography Series, Number 3.
by: DeLuca, L., et al.
Published: (1975)
NC-Bench: An LLM Benchmark for Evaluating Conversational Competence
by: Moore, Robert J., et al.
Published: (2026)
FedDebug: Systematic Debugging for Federated Learning Applications
by: Gill, Waris, et al.
Published: (2023)
Data-Prep-Kit: getting your data ready for LLM application development
by: Wood, David, et al.
Published: (2024)
Mixing Condition Numbers and Oracles for Accurate Floating-point Debugging
by: Kulkarni, Bhargav, et al.
Published: (2025)
CIFE: Code Instruction-Following Evaluation
by: Gunnu, Sravani, et al.
Published: (2025)
ILAEDA: An Imitation Learning Based Approach for Automatic Exploratory Data Analysis
by: Manatkar, Abhijit, et al.
Published: (2024)
Environmental policy behavioral spillovers: The impact of California's single‐use carryout bag ban on the use of unregulated single‐use plastics
by: Sungeun Yoon, et al.
Published: (2024)
Enterprise Benchmarks for Large Language Model Evaluation
by: Zhang, Bing, et al.
Published: (2024)
Fixing Hardware Security Bugs with Large Language Models
by: Ahmad, Baleegh, et al.
Published: (2023)
From challenge to innovation: A grassroots study of teachers’ classroom assessment innovations
by: Christopher DeLuca, et al.
Published: (2024)
How hermeneutics can guide grading in integrated STEAM education: An evidence‐informed perspective
by: Christopher DeLuca, et al.
Published: (2024)
Balancing disciplinary and integrated learning: How exemplary STEM teachers negotiate tensions of practice
by: Michelle Dubek, et al.
Published: (2024)
Le dysfonctionnement socio-spatial des grands ensembles en Algérie: technique de l’analyse wayfinding par méthode “movement traces” et l’analyse morphologique (syntaxe spatiale) par logiciel “depthmap1”
by: Amara Hima
Published: (2018)
Dextr: Zero-Shot Neural Architecture Search with Singular Value Decomposition and Extrinsic Curvature
by: Asthana, Rohan, et al.
Published: (2025)
An Automatic Prompt Generation System for Tabular Data Tasks
by: Akella, Ashlesha, et al.
Published: (2024)
AutoChip: Automating HDL Generation Using LLM Feedback
by: Thakur, Shailja, et al.
Published: (2023)
Automatically Improving LLM-based Verilog Generation using EDA Tool Feedback
by: Blocklove, Jason, et al.
Published: (2024)
Peak‐to‐average power ratio reduction of orthogonal frequency division multiplexing signals using improved salp swarm optimization‐based partial transmit sequence model
by: Vandana Tripathi, et al.
Published: (2025)
Internal Resorption: A Retrospective Cone‐Beam Computed Tomography Analysis of 50 Cases With Outcome Assessment
by: Schuyler DeLuca, et al.
Published: (2026)