| Main author: | |
|---|---|
| Format: | Digital resource |
| Language: | English |
| Published: | Zenodo, 2026 |
| Subjects: | |
| Online access: | https://doi.org/10.5281/zenodo.20321300 |
Contents:
- Large Language Models (LLMs) such as GPT-3 and BERT have significantly improved the performance of natural language processing applications, including chatbots, machine translation, content generation, and virtual assistants. Despite their high accuracy and advanced language understanding, these models demand substantial computational resources: powerful GPUs, large memory capacity, and considerable energy. These requirements make LLMs difficult to deploy in resource-constrained environments such as mobile devices, Internet of Things (IoT) systems, embedded platforms, and edge computing infrastructures.

  This study focuses on improving the computational efficiency of LLMs while maintaining acceptable performance. It examines the major challenges of deploying LLMs in low-resource environments and reviews common optimization techniques: model compression, pruning, quantization, knowledge distillation, and parameter-efficient fine-tuning. It also explores the trade-off between model accuracy and computational efficiency and highlights the importance of lightweight, scalable AI solutions for edge computing applications.

  The findings suggest that optimization methods can significantly reduce model size, inference time, memory usage, and energy consumption, making LLMs more practical for real-world deployment on low-power devices. However, balancing efficiency and performance remains a major challenge, because reducing computational requirements too far can degrade model accuracy. The study concludes by recommending practical strategies for developing efficient, accessible, and scalable LLMs suitable for diverse resource-constrained environments.
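As a rough illustration of why quantization, one of the techniques named in the abstract, reduces model size at some cost in precision, the sketch below applies symmetric 8-bit quantization to a short list of weights. It is not taken from the study: the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, and practical toolchains typically use per-channel scales and calibrated ranges rather than a single per-tensor scale.

```python
import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to int8 codes with one scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any positive scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = array.array(
        "b", (max(-127, min(127, round(w / scale))) for w in weights)
    )
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]


# Hypothetical weights, for illustration only.
weights = [0.42, -1.27, 0.05, 0.88, -0.31, 1.10]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

fp32_bytes = len(weights) * array.array("f").itemsize  # storage as float32
int8_bytes = len(q) * q.itemsize                       # storage as int8
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"float32: {fp32_bytes} bytes, int8: {int8_bytes} bytes")
print(f"max reconstruction error: {max_error:.4f} (scale {scale:.4f})")
```

Storage drops by a factor of four relative to float32, while each restored weight can differ from the original by up to half the scale. This is a small-scale instance of the accuracy versus efficiency trade-off the abstract describes.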