Staff View: Foundry: Distilling 3D Foundation Models for the Edge :: Library Catalog

Bibliographic Details
Main Authors:	Letellier, Guillaume; Srivastava, Siddharth; Jurie, Frédéric; Sharma, Gaurav
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning; Neural and Evolutionary Computing
Online Access:	https://arxiv.org/abs/2511.20721

_version_	1866915892116324352
author	Letellier, Guillaume; Srivastava, Siddharth; Jurie, Frédéric; Sharma, Gaurav
author_facet	Letellier, Guillaume; Srivastava, Siddharth; Jurie, Frédéric; Sharma, Gaurav
contents	Foundation models pre-trained with self-supervised learning (SSL) on large-scale datasets have become powerful general-purpose feature extractors. However, their immense size and computational cost make them prohibitive for deployment on edge devices such as robots and AR/VR headsets. Existing compression techniques like standard knowledge distillation create efficient 'specialist' models but sacrifice the crucial, downstream-agnostic generality that makes foundation models so valuable. In this paper, we introduce Foundation Model Distillation (FMD), a new paradigm for compressing large SSL models into compact, efficient, and faithful proxies that retain their general-purpose representational power. We present Foundry, the first implementation of FMD for 3D point clouds. Our approach, Foundry, trains a student to learn a compressed set of SuperTokens that reconstruct the teacher's token-level representations, capturing a compact basis of its latent space. A single distilled model maintains strong transferability across diverse downstream tasks (classification, part segmentation, and few-shot scenarios), approaching full foundation-model performance while using significantly fewer tokens and FLOPs, making such models more practical for deployment on resource-constrained hardware.
format	Preprint
id	arxiv_https___arxiv_org_abs_2511_20721
institution	arXiv
publishDate	2025
record_format	arxiv
title	Foundry: Distilling 3D Foundation Models for the Edge
topic	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning; Neural and Evolutionary Computing
url	https://arxiv.org/abs/2511.20721

Similar Items