Bibliographic Details
Main Authors: Wang, Olivia, Khir, Reem
Format: Preprint
Published: 2026
Subjects: Optimization and Control
Online Access: https://arxiv.org/abs/2604.23889
author Wang, Olivia
Khir, Reem
contents Column generation is a widely used decomposition technique for large-scale linear programs, but it often suffers from slow convergence due to poor initial dual estimates and dual oscillations. Stabilization techniques such as smoothing and penalization can mitigate these issues, but their effectiveness depends heavily on parameter selection, which requires careful tuning to avoid degrading performance. This paper presents a common framework for smoothing and penalization, showing that despite their different mechanisms, both are governed by two design choices: a reference point in the dual space and stabilization parameters that regulate how strongly that reference influences pricing. Within this framework, we derive parameter bounds that ensure progress, analyze predicted duals as reference points, and establish convergence guarantees for both methods. These results motivate and guide the design of RLSCG, a reinforcement learning-guided framework that adaptively selects stabilization parameters at each iteration. Computational experiments on the Cutting Stock Problem show that RLSCG substantially reduces iteration count and computation time on most synthetic and benchmark instances relative to traditional column generation, rule-based adaptive stabilization, and learning-based column selection, with the largest gains on large-scale instances.
format Preprint
id arxiv_https___arxiv_org_abs_2604_23889
institution arXiv
publishDate 2026
record_format arxiv
title Learning to Control Stabilization in Column Generation
topic Optimization and Control
url https://arxiv.org/abs/2604.23889