:: Library Catalog

Bibliographic Details
Main Authors:	Bai, Xueying; Shang, Jinghuan; Sun, Yifan; Balasubramanian, Niranjan
Format:	Preprint
Published:	2022
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2205.12186

Similar Items

Does RoBERTa Perform Better than BERT in Continual Learning: An Attention Sink Perspective
by: Bai, Xueying, et al.
Published: (2024)

Comparing Pre-trained Human Language Models: Is it Better with Human Context as Groups, Individual Traits, or Both?
by: Soni, Nikita, et al.
Published: (2024)

Large Human Language Models: A Need and the Challenges
by: Soni, Nikita, et al.
Published: (2023)

Evaluation of LLMs-based Hidden States as Author Representations for Psychological Human-Centered NLP Tasks
by: Soni, Nikita, et al.
Published: (2025)

Addressing the Ecological Fallacy in Larger LMs with Human Context
by: Soni, Nikita, et al.
Published: (2026)

AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
by: Trivedi, Harsh, et al.
Published: (2024)

The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm
by: Aakanksha, et al.
Published: (2024)

Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment
by: Zhang, Yifan, et al.
Published: (2024)

CP-Prompt: Composition-Based Cross-modal Prompting for Domain-Incremental Continual Learning
by: Feng, Yu, et al.
Published: (2024)

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
by: Yuan, Hui, et al.
Published: (2024)

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
by: Li, Xiang, et al.
Published: (2024)

Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models
by: Abbes, Istabrak, et al.
Published: (2025)

Aligner: Efficient Alignment by Learning to Correct
by: Ji, Jiaming, et al.
Published: (2024)

Sample-Efficient Alignment for LLMs
by: Liu, Zichen, et al.
Published: (2024)

Interpretable Safety Alignment via SAE-Constructed Low-Rank Subspace Adaptation
by: Wang, Dianyun, et al.
Published: (2025)

From Instructions to Constraints: Language Model Alignment with Automatic Constraint Verification
by: Wang, Fei, et al.
Published: (2024)

Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game
by: Cheng, Pengyu, et al.
Published: (2023)

Aligners: Decoupling LLMs and Alignment
by: Ngweta, Lilian, et al.
Published: (2024)

Towards Understanding Safety Alignment: A Mechanistic Perspective from Safety Neurons
by: Chen, Jianhui, et al.
Published: (2024)

Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models
by: Liu, Qin, et al.
Published: (2024)

AlignBench: Benchmarking Chinese Alignment of Large Language Models
by: Liu, Xiao, et al.
Published: (2023)

ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models
by: Wang, Siwei, et al.
Published: (2024)

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
by: Chuang, Yung-Sung, et al.
Published: (2025)

Reviving The Classics: Active Reward Modeling in Large Language Model Alignment
by: Shen, Yunyi, et al.
Published: (2025)

Enhancing Time Series Forecasting via Multi-Level Text Alignment with LLMs
by: Zhao, Taibiao, et al.
Published: (2025)

Self-Play Preference Optimization for Language Model Alignment
by: Wu, Yue, et al.
Published: (2024)

In-context Continual Learning Assisted by an External Continual Learner
by: Momeni, Saleh, et al.
Published: (2024)

Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment
by: Wang, Haowen, et al.
Published: (2025)

SALMON: Self-Alignment with Instructable Reward Models
by: Sun, Zhiqing, et al.
Published: (2023)

DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
by: Chen, Yilong, et al.
Published: (2024)

Contrastive Reasoning Alignment: Reinforcement Learning from Hidden Representations
by: Luo, Haozheng, et al.
Published: (2026)

Inverse Reinforcement Learning with Dynamic Reward Scaling for LLM Alignment
by: Cheng, Ruoxi, et al.
Published: (2025)

Can AI-Generated Text be Reliably Detected?
by: Sadasivan, Vinu Sankar, et al.
Published: (2023)

Efficient Model Development through Fine-tuning Transfer
by: Lin, Pin-Jie, et al.
Published: (2025)

Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
by: Sun, Zhiqing, et al.
Published: (2024)

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
by: Qiu, Jiahao, et al.
Published: (2024)

Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences
by: Pattnaik, Pulkit, et al.
Published: (2024)

Human Inspired Progressive Alignment and Comparative Learning for Grounded Word Acquisition
by: Bao, Yuwei, et al.
Published: (2023)

Reformatted Alignment
by: Fan, Run-Ze, et al.
Published: (2024)

Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective
by: Wang, Siwei, et al.
Published: (2025)