Staff View: LFM2 Technical Report :: Library Catalog

Bibliographic Details
Main Authors:	Amini, Alexander; Banaszak, Anna; Benoit, Harold; Böök, Arthur; Dakhran, Tarek; Duong, Song; Eng, Alfred; Fernandes, Fernando; Härkönen, Marc; Harrington, Anne; Hasani, Ramin; Karwa, Saniya; Khrustalev, Yuri; Labonne, Maxime; Lechner, Mathias; Lechner, Valentine; Lee, Simon; Li, Zetian; Loo, Noel; Marks, Jacob; Mosca, Edoardo; Paech, Samuel J.; Pak, Paul; Parnichkun, Rom N.; Quach, Alex; Rogers, Ryan; Rus, Daniela; Saxena, Nayan; Schlager, Bettina; Seyde, Tim; Smith, Jimmy T. H.; Tadimeti, Aditya; Tumma, Neehal
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2511.23404

_version_	1866911292510437376
author	Amini, Alexander; Banaszak, Anna; Benoit, Harold; Böök, Arthur; Dakhran, Tarek; Duong, Song; Eng, Alfred; Fernandes, Fernando; Härkönen, Marc; Harrington, Anne; Hasani, Ramin; Karwa, Saniya; Khrustalev, Yuri; Labonne, Maxime; Lechner, Mathias; Lechner, Valentine; Lee, Simon; Li, Zetian; Loo, Noel; Marks, Jacob; Mosca, Edoardo; Paech, Samuel J.; Pak, Paul; Parnichkun, Rom N.; Quach, Alex; Rogers, Ryan; Rus, Daniela; Saxena, Nayan; Schlager, Bettina; Seyde, Tim; Smith, Jimmy T. H.; Tadimeti, Aditya; Tumma, Neehal
author_facet	Amini, Alexander; Banaszak, Anna; Benoit, Harold; Böök, Arthur; Dakhran, Tarek; Duong, Song; Eng, Alfred; Fernandes, Fernando; Härkönen, Marc; Harrington, Anne; Hasani, Ramin; Karwa, Saniya; Khrustalev, Yuri; Labonne, Maxime; Lechner, Mathias; Lechner, Valentine; Lee, Simon; Li, Zetian; Loo, Noel; Marks, Jacob; Mosca, Edoardo; Paech, Samuel J.; Pak, Paul; Parnichkun, Rom N.; Quach, Alex; Rogers, Ryan; Rus, Daniela; Saxena, Nayan; Schlager, Bettina; Seyde, Tim; Smith, Jimmy T. H.; Tadimeti, Aditya; Tumma, Neehal
contents	We present LFM2, a family of Liquid Foundation Models designed for efficient on-device deployment and strong task capabilities. Using hardware-in-the-loop architecture search under edge latency and memory constraints, we obtain a compact hybrid backbone that combines gated short convolutions with a small number of grouped query attention blocks, delivering up to 2x faster prefill and decode on CPUs compared to similarly sized models. The LFM2 family covers 350M-8.3B parameters, including dense models (350M, 700M, 1.2B, 2.6B) and a mixture-of-experts variant (8.3B total, 1.5B active), all with 32K context length. LFM2's training pipeline includes a tempered, decoupled Top-K knowledge distillation objective that avoids support mismatch; curriculum learning with difficulty-ordered data; and a three-stage post-training recipe of supervised fine-tuning, length-normalized preference optimization, and model merging. Pre-trained on 10-12T tokens, LFM2 models achieve strong results across diverse benchmarks; for example, LFM2-2.6B reaches 79.56% on IFEval and 82.41% on GSM8K. We further build multimodal and retrieval variants: LFM2-VL for vision-language tasks, LFM2-Audio for speech, and LFM2-ColBERT for retrieval. LFM2-VL supports tunable accuracy-latency tradeoffs via token-efficient visual processing, while LFM2-Audio separates audio input and output pathways to enable real-time speech-to-speech interaction competitive with models 3x larger. LFM2-ColBERT provides a low-latency encoder for queries and documents, enabling high-performance retrieval across multiple languages. All models are released with open weights and deployment packages for ExecuTorch, llama.cpp, and vLLM, making LFM2 a practical base for edge applications that need fast, memory-efficient inference and strong task capabilities.
format	Preprint
id	arxiv_https___arxiv_org_abs_2511_23404
institution	arXiv
publishDate	2025
record_format	arxiv
title	LFM2 Technical Report
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2511.23404
