:: Library Catalog

Bibliographic Details
Main Authors:	Yuvraj, Pritish; Devarakonda, Siva
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2509.18400

Similar Items

Benchmarking Harmonized Tariff Schedule Classification Models
by: Judy, Bryce
Published: (2024)

Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
by: Bandarkar, Lucas, et al.
Published: (2024)

REGAL: A Registry-Driven Architecture for Deterministic Grounding of Agentic AI in Enterprise Telemetry
by: Agrawal, Yuvraj
Published: (2026)

Benchmarking and Adapting On-Device LLMs for Clinical Decision Support
by: Munim, Alif, et al.
Published: (2025)

DistShap: Scalable GNN Explanations with Distributed Shapley Values
by: Akkas, Selahattin, et al.
Published: (2025)

A Deterministic Agentic Workflow for HS Tariff Classification: Multi-Dimensional Rule Reasoning with Interpretable Decisions
by: Zhang, Yu, et al.
Published: (2026)

AdaptEval: A Benchmark for Evaluating Large Language Models on Code Snippet Adaptation
by: Zhang, Tanghaoran, et al.
Published: (2026)

RAIL in the Wild: Operationalizing Responsible AI Evaluation Using Anthropic's Value Dataset
by: Verma, Sumit, et al.
Published: (2025)

Can We Make Code Green? Understanding Trade-Offs in LLMs vs. Human Code Optimizations
by: Rani, Pooja, et al.
Published: (2025)

Classification-Based Automatic HDL Code Generation Using LLMs
by: Sun, Wenhao, et al.
Published: (2024)

ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code
by: Hua, Tianyu, et al.
Published: (2025)

Hallucination by Code Generation LLMs: Taxonomy, Benchmarks, Mitigation, and Challenges
by: Lee, Yunseo, et al.
Published: (2025)

Benchmarking Multimodal LLMs on Code Generation for Complex Interactive Webpages
by: Wu, Fan, et al.
Published: (2026)

PythonSaga: Redefining the Benchmark to Evaluate Code Generating LLMs
by: Yadav, Ankit, et al.
Published: (2024)

Adapting Multilingual LLMs to Low-Resource Languages with Knowledge Graphs via Adapters
by: Gurgurov, Daniil, et al.
Published: (2024)

Multimodal Approach for Harmonized System Code Prediction
by: Amel, Otmane, et al.
Published: (2024)

When Developer Aid Becomes Security Debt: A Systematic Analysis of Insecure Behaviors in LLM Coding Agents
by: Kozak, Matous, et al.
Published: (2025)

ACT: Bridging the Gap in Code Translation through Synthetic Data Generation & Adaptive Training
by: Saxena, Shreya, et al.
Published: (2025)

Adapting LLMs to Time Series Forecasting via Temporal Heterogeneity Modeling and Semantic Alignment
by: Sun, Yanru, et al.
Published: (2025)

Benchmarking LLMs for Fine-Grained Code Review with Enriched Context in Practice
by: Hu, Ruida, et al.
Published: (2025)

Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering
by: Alebachew, Yoseph Berhanu, et al.
Published: (2026)

Harmonic LLMs are Trustworthy
by: Kersting, Nicholas S., et al.
Published: (2024)

EduAdapt: A Question Answer Benchmark Dataset for Evaluating Grade-Level Adaptability in LLMs
by: Naeem, Numaan, et al.
Published: (2025)

ATLAS: Adaptive Trading with LLM AgentS Through Dynamic Prompt Optimization and Multi-Agent Coordination
by: Papadakis, Charidimos, et al.
Published: (2025)

Adapting LLMs for Minimal-edit Grammatical Error Correction
by: Staruch, Ryszard, et al.
Published: (2025)

Are LLMs Ready for TOON? Benchmarking Structural Correctness-Sustainability Trade-offs in Novel Structured Output Formats
by: Masciari, Elio, et al.
Published: (2026)

Smaller = Weaker? Benchmarking Robustness of Quantized LLMs in Code Generation
by: Fang, Sen, et al.
Published: (2025)

Automating Code Adaptation for MLOps -- A Benchmarking Study on LLMs
by: Patel, Harsh, et al.
Published: (2024)

Drawing Pandas: A Benchmark for LLMs in Generating Plotting Code
by: Galimzyanov, Timur, et al.
Published: (2024)

SensorBench: Benchmarking LLMs in Coding-Based Sensor Processing
by: Quan, Pengrui, et al.
Published: (2024)

Humans and LLMs Diverge on Probabilistic Inferences
by: Kamath, Gaurav, et al.
Published: (2026)

Feature Selection Empowered BERT for Detection of Hate Speech with Vocabulary Augmentation
by: Desai, Pritish N., et al.
Published: (2025)

EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents
by: Zala, Abhay, et al.
Published: (2024)

Rectifier: Code Translation with Corrector via LLMs
by: Yin, Xin, et al.
Published: (2024)

HardSecBench: Benchmarking the Security Awareness of LLMs for Hardware Code Generation
by: Chen, Qirui, et al.
Published: (2026)

Validate Your Authority: Benchmarking LLMs on Multi-Label Precedent Treatment Classification
by: Demir, M. Mikail, et al.
Published: (2026)

Do LLMs Really Adapt to Domains? An Ontology Learning Perspective
by: Mai, Huu Tan, et al.
Published: (2024)

Quantum Artificial Intelligence for Mission-Critical Systems: Foundations, Architectural Elements, and Future Directions
by: Sai, Siva, et al.
Published: (2025)

Unsupervised Learning of Harmonic Analysis Based on Neural HSMM with Code Quality Templates
by: Uehara, Yui
Published: (2024)

SayCoNav: Utilizing Large Language Models for Adaptive Collaboration in Decentralized Multi-Robot Navigation
by: Rajvanshi, Abhinav, et al.
Published: (2025)