| Main Authors: | Lin, Zhenru, Yao, Yiqun, Yuan, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.01784 |
Similar Items
CodeMind: Evaluating Large Language Models for Code Reasoning
by: Liu, Changshu, et al.
Published: (2024)
FPMoE: A Sparse Mixture-of-Experts Approach to Functional Code Generation
by: Pham, Loc, et al.
Published: (2026)
Grammar-Based Code Representation: Is It a Worthy Pursuit for LLMs?
by: Liang, Qingyuan, et al.
Published: (2025)
Assessing Code Understanding in LLMs
by: Laneve, Cosimo, et al.
Published: (2025)
AutoCode: LLMs as Problem Setters for Competitive Programming
by: Zhou, Shang, et al.
Published: (2025)
ECO: Enhanced Code Optimization via Performance-Aware Prompting for Code-LLMs
by: Kim, Su-Hyeon, et al.
Published: (2025)
CodeIF-Bench: Evaluating Instruction-Following Capabilities of Large Language Models in Interactive Code Generation
by: Wang, Peiding, et al.
Published: (2025)
Can LLMs Compress (and Decompress)? Evaluating Code Understanding and Execution via Invertibility
by: Maveli, Nickil, et al.
Published: (2026)
CodeV: Empowering LLMs with HDL Generation through Multi-Level Summarization
by: Zhao, Yang, et al.
Published: (2024)
VeriEquivBench: An Equivalence Score for Ground-Truth-Free Evaluation of Formally Verifiable Code
by: Zeng, Lingfei, et al.
Published: (2025)
Lita: Light Agent Uncovers the Agentic Coding Capabilities of LLMs
by: Dai, Hankun, et al.
Published: (2025)
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff
by: Tang, Hao, et al.
Published: (2024)
Is Functional Correctness Enough to Evaluate Code Language Models? Exploring Diversity of Generated Codes
by: Chon, Heejae, et al.
Published: (2024)
From Prompts to Performance: Evaluating LLMs for Task-based Parallel Code Generation
by: Bantel, Linus, et al.
Published: (2026)
Smaller = Weaker? Benchmarking Robustness of Quantized LLMs in Code Generation
by: Fang, Sen, et al.
Published: (2025)
AutoMCQ -- Automatically Generate Code Comprehension Questions using GenAI
by: Goodfellow, Martin, et al.
Published: (2025)
Beyond Code Pairs: Dialogue-Based Data Generation for LLM Code Translation
by: Chen, Le, et al.
Published: (2025)
CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules
by: Le, Hung, et al.
Published: (2023)
Can LLMs Reason About Program Semantics? A Comprehensive Evaluation of LLMs on Formal Specification Inference
by: Le-Cong, Thanh, et al.
Published: (2025)
Code Broker: A Multi-Agent System for Automated Code Quality Assessment
by: Attrah, Samer
Published: (2026)
Bench4HLS: End-to-End Evaluation of LLMs in High-Level Synthesis Code Generation
by: Khan, M Zafir Sadik, et al.
Published: (2026)
A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond
by: Sun, Qiushi, et al.
Published: (2024)
MCTS-SQL: Light-Weight LLMs can Master the Text-to-SQL through Monte Carlo Tree Search
by: Yuan, Shuozhi, et al.
Published: (2025)
Code Simulation Challenges for Large Language Models
by: La Malfa, Emanuele, et al.
Published: (2024)
ReFEree: Reference-Free and Fine-Grained Method for Evaluating Factual Consistency in Real-World Code Summarization
by: Bae, Suyoung, et al.
Published: (2026)
CSSG: Measuring Code Similarity with Semantic Graphs
by: Lu, Yiyang, et al.
Published: (2026)
Executing as You Generate: Hiding Execution Latency in LLM Code Generation
by: Sun, Zhensu, et al.
Published: (2026)
PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback
by: Peng, Yun, et al.
Published: (2024)
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging
by: Shi, Yuling, et al.
Published: (2024)
Conditioning LLMs to Generate Code-Switched Text
by: Heredia, Maite, et al.
Published: (2025)
LongCodeBench: Evaluating Coding LLMs at 1M Context Windows
by: Rando, Stefano, et al.
Published: (2025)
Agentic Code Reasoning
by: Ugare, Shubham, et al.
Published: (2026)
REINFOREST: Reinforcing Semantic Code Similarity for Cross-Lingual Code Search Models
by: Saieva, Anthony, et al.
Published: (2023)
A Preliminary Study of Multilingual Code Language Models for Code Generation Task Using Translated Benchmarks
by: Dandamudi, Rohit, et al.
Published: (2024)
DOCE: Finding the Sweet Spot for Execution-Based Code Generation
by: Li, Haau-Sing, et al.
Published: (2024)
Perish or Flourish? A Holistic Evaluation of Large Language Models for Code Generation in Functional Programming
by: Lang, Nguyet-Anh H., et al.
Published: (2026)
MHRC-Bench: A Multilingual Hardware Repository-Level Code Completion benchmark
by: Zou, Qingyun, et al.
Published: (2026)
Linguacodus: A Synergistic Framework for Transformative Code Generation in Machine Learning Pipelines
by: Trofimova, Ekaterina, et al.
Published: (2024)
A Case Study on the Effectiveness of LLMs in Verification with Proof Assistants
by: Bayazıt, Barış, et al.
Published: (2025)
VisCoder2: Building Multi-Language Visualization Coding Agents
by: Ni, Yuansheng, et al.
Published: (2025)