| Main Authors: | Wen, Jiaxin, Zhong, Ruiqi, Ke, Pei, Shao, Zhihong, Wang, Hongning, Huang, Minlie |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.04604 |
Similar Items
Language Model Decoding as Direct Metrics Optimization
by: Ji, Haozhe, et al.
Published: (2023)
Unlocking Reasoning Potential in Large Langauge Models by Scaling Code-form Planning
by: Wen, Jiaxin, et al.
Published: (2024)
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
by: Zhang, Zhexin, et al.
Published: (2023)
Language Models Learn to Mislead Humans via RLHF
by: Wen, Jiaxin, et al.
Published: (2024)
IF-RewardBench: Benchmarking Judge Models for Instruction-Following Evaluation
by: Wen, Bosi, et al.
Published: (2026)
HPSS: Heuristic Prompting Strategy Search for LLM Evaluators
by: Wen, Bosi, et al.
Published: (2025)
IF-CRITIC: Towards a Fine-Grained LLM Critic for Instruction-Following Evaluation
by: Wen, Bosi, et al.
Published: (2025)
Towards Efficient Exact Optimization of Language Model Alignment
by: Ji, Haozhe, et al.
Published: (2024)
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training
by: Cheng, Jiale, et al.
Published: (2023)
AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback
by: Guan, Jian, et al.
Published: (2024)
RLAR: An Agentic Reward System for Multi-task Reinforcement Learning on Large Language Models
by: Feng, Andrew Zhuoer, et al.
Published: (2026)
From Theft to Bomb-Making: The Ripple Effect of Unlearning in Defending Against Jailbreak Attacks
by: Zhang, Zhexin, et al.
Published: (2024)
Benchmarking Complex Instruction-Following with Multiple Constraints Composition
by: Wen, Bosi, et al.
Published: (2024)
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors
by: Zhang, Zhexin, et al.
Published: (2024)
Think Socially via Cognitive Reasoning
by: Zhou, Jinfeng, et al.
Published: (2025)
AutoCode: LLMs as Problem Setters for Competitive Programming
by: Zhou, Shang, et al.
Published: (2025)
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning
by: Lin, Honglin, et al.
Published: (2025)
AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models
by: Cheng, Jiale, et al.
Published: (2024)
Data Selection via Optimal Control for Language Models
by: Gu, Yuxian, et al.
Published: (2024)
Agent-SafetyBench: Evaluating the Safety of LLM Agents
by: Zhang, Zhexin, et al.
Published: (2024)
Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!
by: Zhang, Zhexin, et al.
Published: (2025)
CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation
by: Ke, Pei, et al.
Published: (2023)
HoWToBench: Holistic Evaluation for LLM's Capability in Human-level Writing using Tree of Writing
by: Feng, Andrew Zhuoer, et al.
Published: (2026)
Neural Task Synthesis for Visual Programming
by: Pădurean, Victor-Alexandru, et al.
Published: (2023)
ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback
by: Hou, Zhenyu, et al.
Published: (2024)
QiMeng-Xpiler: Transcompiling Tensor Programs for Deep Learning Systems with a Neural-Symbolic Approach
by: Dong, Shouyang, et al.
Published: (2025)
CharacterBench: Benchmarking Character Customization of Large Language Models
by: Zhou, Jinfeng, et al.
Published: (2024)
When Smiley Turns Hostile: Interpreting How Emojis Trigger LLMs' Toxicity
by: Cui, Shiyao, et al.
Published: (2025)
Compile to Compress: Boosting Formal Theorem Provers by Compiler Outputs
by: Li, Guchan, et al.
Published: (2026)
Quokka: Accelerating Program Verification with LLMs via Invariant Synthesis
by: Wei, Anjiang, et al.
Published: (2025)
The Superalignment of Superhuman Intelligence with Large Language Models
by: Huang, Minlie, et al.
Published: (2024)
C Analyzer : A Static Program Analysis Tool for C Programs
by: Solanki, Rajendra Kumar
Published: (2024)
SocialSim: Towards Socialized Simulation of Emotional Support Conversation
by: Chen, Zhuang, et al.
Published: (2025)
Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering
by: Wen, Zhihua, et al.
Published: (2024)
Hear Your Code Fail, Voice-Assisted Debugging for Python
by: Amiri, Sayed Mahbub Hasan, et al.
Published: (2025)
AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition
by: Chen, Zhaorun, et al.
Published: (2024)
Constrained Code Generation with Discrete Diffusion
by: Shao, Lize, et al.
Published: (2026)
Bridging the Knowledge Void: Inference-time Acquisition of Unfamiliar Programming Languages for Coding Tasks
by: Shen, Chen, et al.
Published: (2026)
RAVEL: Reasoning Agents for Validating and Evaluating LLM Text Synthesis
by: Feng, Andrew Zhuoer, et al.
Published: (2026)
ShieldVLM: Safeguarding the Multimodal Implicit Toxicity via Deliberative Reasoning with LVLMs
by: Cui, Shiyao, et al.
Published: (2025)