| Main Authors: | Bai, Xueying, Shang, Jinghuan, Sun, Yifan, Balasubramanian, Niranjan |
|---|---|
| Format: | Preprint |
| Published: | 2022 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2205.12186 |
Similar Items
Does RoBERTa Perform Better than BERT in Continual Learning: An Attention Sink Perspective
by: Bai, Xueying, et al.
Published: (2024)
Comparing Pre-trained Human Language Models: Is it Better with Human Context as Groups, Individual Traits, or Both?
by: Soni, Nikita, et al.
Published: (2024)
Large Human Language Models: A Need and the Challenges
by: Soni, Nikita, et al.
Published: (2023)
Evaluation of LLMs-based Hidden States as Author Representations for Psychological Human-Centered NLP Tasks
by: Soni, Nikita, et al.
Published: (2025)
Addressing the Ecological Fallacy in Larger LMs with Human Context
by: Soni, Nikita, et al.
Published: (2026)
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
by: Trivedi, Harsh, et al.
Published: (2024)
The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm
by: Aakanksha, et al.
Published: (2024)
Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment
by: Zhang, Yifan, et al.
Published: (2024)
CP-Prompt: Composition-Based Cross-modal Prompting for Domain-Incremental Continual Learning
by: Feng, Yu, et al.
Published: (2024)
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
by: Yuan, Hui, et al.
Published: (2024)
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
by: Li, Xiang, et al.
Published: (2024)
Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models
by: Abbes, Istabrak, et al.
Published: (2025)
Aligner: Efficient Alignment by Learning to Correct
by: Ji, Jiaming, et al.
Published: (2024)
Sample-Efficient Alignment for LLMs
by: Liu, Zichen, et al.
Published: (2024)
Interpretable Safety Alignment via SAE-Constructed Low-Rank Subspace Adaptation
by: Wang, Dianyun, et al.
Published: (2025)
From Instructions to Constraints: Language Model Alignment with Automatic Constraint Verification
by: Wang, Fei, et al.
Published: (2024)
Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game
by: Cheng, Pengyu, et al.
Published: (2023)
Aligners: Decoupling LLMs and Alignment
by: Ngweta, Lilian, et al.
Published: (2024)
Towards Understanding Safety Alignment: A Mechanistic Perspective from Safety Neurons
by: Chen, Jianhui, et al.
Published: (2024)
Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models
by: Liu, Qin, et al.
Published: (2024)
AlignBench: Benchmarking Chinese Alignment of Large Language Models
by: Liu, Xiao, et al.
Published: (2023)
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models
by: Wang, Siwei, et al.
Published: (2024)
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
by: Chuang, Yung-Sung, et al.
Published: (2025)
Reviving The Classics: Active Reward Modeling in Large Language Model Alignment
by: Shen, Yunyi, et al.
Published: (2025)
Enhancing Time Series Forecasting via Multi-Level Text Alignment with LLMs
by: Zhao, Taibiao, et al.
Published: (2025)
Self-Play Preference Optimization for Language Model Alignment
by: Wu, Yue, et al.
Published: (2024)
In-context Continual Learning Assisted by an External Continual Learner
by: Momeni, Saleh, et al.
Published: (2024)
Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment
by: Wang, Haowen, et al.
Published: (2025)
SALMON: Self-Alignment with Instructable Reward Models
by: Sun, Zhiqing, et al.
Published: (2023)
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
by: Chen, Yilong, et al.
Published: (2024)
Contrastive Reasoning Alignment: Reinforcement Learning from Hidden Representations
by: Luo, Haozheng, et al.
Published: (2026)
Inverse Reinforcement Learning with Dynamic Reward Scaling for LLM Alignment
by: Cheng, Ruoxi, et al.
Published: (2025)
Can AI-Generated Text be Reliably Detected?
by: Sadasivan, Vinu Sankar, et al.
Published: (2023)
Efficient Model Development through Fine-tuning Transfer
by: Lin, Pin-Jie, et al.
Published: (2025)
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
by: Sun, Zhiqing, et al.
Published: (2024)
TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
by: Qiu, Jiahao, et al.
Published: (2024)
Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences
by: Pattnaik, Pulkit, et al.
Published: (2024)
Human Inspired Progressive Alignment and Comparative Learning for Grounded Word Acquisition
by: Bao, Yuwei, et al.
Published: (2023)
Reformatted Alignment
by: Fan, Run-Ze, et al.
Published: (2024)
Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective
by: Wang, Siwei, et al.
Published: (2025)