Staff View: LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding :: Library Catalog

Bibliographic Details
Main Authors:	Hu, Yuxuan; Liu, Jihao; Wang, Ke; Zhen, Jinliang; Shi, Weikang; Zhang, Manyuan; Dou, Qi; Liu, Rui; Zhou, Aojun; Li, Hongsheng
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2509.05657

_version_	1866916969086713856
author	Hu, Yuxuan; Liu, Jihao; Wang, Ke; Zhen, Jinliang; Shi, Weikang; Zhang, Manyuan; Dou, Qi; Liu, Rui; Zhou, Aojun; Li, Hongsheng
author_facet	Hu, Yuxuan; Liu, Jihao; Wang, Ke; Zhen, Jinliang; Shi, Weikang; Zhang, Manyuan; Dou, Qi; Liu, Rui; Zhou, Aojun; Li, Hongsheng
contents	Recent progress in Large Language Models (LLMs) has opened new avenues for solving complex optimization problems, including Neural Architecture Search (NAS). However, existing LLM-driven NAS approaches rely heavily on prompt engineering and domain-specific tuning, limiting their practicality and scalability across diverse tasks. In this work, we propose LM-Searcher, a novel framework that leverages LLMs for cross-domain neural architecture optimization without the need for extensive domain-specific adaptation. Central to our approach is NCode, a universal numerical string representation for neural architectures, which enables cross-domain architecture encoding and search. We also reformulate the NAS problem as a ranking task, training LLMs to select high-performing architectures from candidate pools using instruction-tuning samples derived from a novel pruning-based subspace sampling strategy. Our curated dataset, encompassing a wide range of architecture-performance pairs, encourages robust and transferable learning. Comprehensive experiments demonstrate that LM-Searcher achieves competitive performance in both in-domain (e.g., CNNs for image classification) and out-of-domain (e.g., LoRA configurations for segmentation and generation) tasks, establishing a new paradigm for flexible and generalizable LLM-based architecture search. The datasets and models will be released at https://github.com/Ashone3/LM-Searcher.
format	Preprint
id	arxiv_https___arxiv_org_abs_2509_05657
institution	arXiv
publishDate	2025
record_format	arxiv
title	LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2509.05657
