Staff View: Language Modeling by Language Models :: Library Catalog

Bibliographic Details
Main Authors:	Cheng, Junyan; Clark, Peter; Richardson, Kyle
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence; Computation and Language; Multiagent Systems
Online Access:	https://arxiv.org/abs/2506.20249

_version_	1866916810765369344
author	Cheng, Junyan; Clark, Peter; Richardson, Kyle
author_facet	Cheng, Junyan; Clark, Peter; Richardson, Kyle
contents	Can we leverage LLMs to model the process of discovering novel language model (LM) architectures? Inspired by real research, we propose a multi-agent LLM approach that simulates the conventional stages of research, from ideation and literature search (proposal stage) to design implementation (code generation), generative pre-training, and downstream evaluation (verification). Using ideas from scaling laws, our system, Genesys, employs a Ladder of Scales approach; new designs are proposed, adversarially reviewed, implemented, and selectively verified at increasingly larger model scales (14M to 350M parameters) with a narrowing budget (the number of models we can train at each scale). To help make discovery efficient and factorizable, Genesys uses a novel genetic programming backbone, which we show has empirical advantages over commonly used direct prompt generation workflows (e.g., an improvement of roughly 86 percentage points in successful design generation, a key bottleneck). We report experiments involving 1,162 newly discovered designs (1,062 fully verified through pre-training) and find the best designs to be highly competitive with known architectures (e.g., outperform GPT2, Mamba2, etc., on 6/9 common benchmarks). We couple these results with comprehensive system-level ablations and formal results, which give broader insights into the design of effective autonomous discovery systems.
format	Preprint
id	arxiv_https___arxiv_org_abs_2506_20249
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Language Modeling by Language Models Cheng, Junyan Clark, Peter Richardson, Kyle Artificial Intelligence Computation and Language Multiagent Systems Can we leverage LLMs to model the process of discovering novel language model (LM) architectures? Inspired by real research, we propose a multi-agent LLM approach that simulates the conventional stages of research, from ideation and literature search (proposal stage) to design implementation (code generation), generative pre-training, and downstream evaluation (verification). Using ideas from scaling laws, our system, Genesys, employs a Ladder of Scales approach; new designs are proposed, adversarially reviewed, implemented, and selectively verified at increasingly larger model scales (14M to 350M parameters) with a narrowing budget (the number of models we can train at each scale). To help make discovery efficient and factorizable, Genesys uses a novel genetic programming backbone, which we show has empirical advantages over commonly used direct prompt generation workflows (e.g., an improvement of roughly 86 percentage points in successful design generation, a key bottleneck). We report experiments involving 1,162 newly discovered designs (1,062 fully verified through pre-training) and find the best designs to be highly competitive with known architectures (e.g., outperform GPT2, Mamba2, etc., on 6/9 common benchmarks). We couple these results with comprehensive system-level ablations and formal results, which give broader insights into the design of effective autonomous discovery systems.
title	Language Modeling by Language Models
topic	Artificial Intelligence; Computation and Language; Multiagent Systems
url	https://arxiv.org/abs/2506.20249

Similar Items