Bibliographic Details
Main Authors: Fovet, Damien, Chamoli, Shashank, Oury, Sarah, Singhal, Srishti
Format: Preprint
Published: 2025
Subjects: Machine Learning, Performance
Online Access: https://arxiv.org/abs/2507.08836
author Fovet, Damien
Chamoli, Shashank
Oury, Sarah
Singhal, Srishti
contents This study evaluates CompactifAI, a compression method developed by Multiverse Computing, applied to the large language model Llama 3.1 8B \cite{llama}. The evaluation focuses on model efficiency (in terms of energy consumption) and accuracy, measured with the CodeCarbon \cite{codecarbon} and Ragas \cite{ragas} frameworks, respectively. The model compressed with CompactifAI \cite{compactifai}\cite{compactifai2} is compared with its full-size version. Our findings reveal that the compressed model not only significantly reduced computational resource usage but also maintained accuracy, making the model more efficient, scalable and cost-effective.
format Preprint
id arxiv_https___arxiv_org_abs_2507_08836
institution arXiv
publishDate 2025
record_format arxiv
title Accuracy and Consumption analysis from a compressed model by CompactifAI from Multiverse Computing
topic Machine Learning
Performance
url https://arxiv.org/abs/2507.08836