| Main Authors: | Fovet, Damien; Chamoli, Shashank; Oury, Sarah; Singhal, Srishti |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.08836 |
| _version_ | 1866913938815320064 |
|---|---|
| author | Fovet, Damien; Chamoli, Shashank; Oury, Sarah; Singhal, Srishti |
| author_facet | Fovet, Damien; Chamoli, Shashank; Oury, Sarah; Singhal, Srishti |
| contents | This study evaluates CompactifAI, a compression method developed by Multiverse Computing, applied to the large language model Llama 3.1 8B \cite{llama}. The evaluation focuses on model efficiency (measured as energy consumption) and accuracy, using the Codecarbon \cite{codecarbon} and Ragas \cite{ragas} frameworks, respectively. The model compressed with CompactifAI \cite{compactifai}\cite{compactifai2} is compared with its full-size version. Our findings show that the compressed model not only required significantly fewer computational resources but also maintained accuracy, making it more efficient, scalable, and cost-effective. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2507_08836 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | Accuracy and Consumption analysis from a compressed model by CompactifAI from Multiverse Computing |
| topic | Machine Learning Performance |
| url | https://arxiv.org/abs/2507.08836 |