| Main Authors: | Shen, Yikang; Guo, Zhen; Cai, Tianle; Qin, Zengyi |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.07413 |
Similar Items
API Pack: A Massive Multi-Programming Language Dataset for API Call Generation
by: Guo, Zhen, et al.
Published: (2024)
MGH Radiology Llama: A Llama 3 70B Model for Radiology
by: Shi, Yucheng, et al.
Published: (2024)
Synthetic Data RL: Task Definition Is All You Need
by: Guo, Yiduo, et al.
Published: (2025)
BURExtract-Llama: An LLM for Clinical Concept Extraction in Breast Ultrasound Reports
by: Chen, Yuxuan, et al.
Published: (2024)
Training-Free Activation Sparsity in Large Language Models
by: Liu, James, et al.
Published: (2024)
Investigating Bias Representations in Llama 2 Chat via Activation Steering
by: Lu, Dawn, et al.
Published: (2024)
Open Llama2 Model for the Lithuanian Language
by: Nakvosas, Artūras, et al.
Published: (2024)
Benchmarking Linguistic Adaptation in Comparable-Sized LLMs: A Study of Llama-3.1-8B, Mistral-7B-v0.1, and Qwen3-8B on Romanized Nepali
by: Rimal, Ananda, et al.
Published: (2026)
Applying Refusal-Vector Ablation to Llama 3.1 70B Agents
by: Lermen, Simon, et al.
Published: (2024)
TinyLlama: An Open-Source Small Language Model
by: Zhang, Peiyuan, et al.
Published: (2024)
The Llama 3 Herd of Models
by: Grattafiori, Aaron, et al.
Published: (2024)
Steering Llama 2 via Contrastive Activation Addition
by: Panickssery, Nina, et al.
Published: (2023)
ReachAgent: Enhancing Mobile Agent via Page Reaching and Operation
by: Wu, Qinzhuo, et al.
Published: (2025)
Llama-Nemotron: Efficient Reasoning Models
by: Bercovich, Akhiad, et al.
Published: (2025)
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
by: Du, Wenyu, et al.
Published: (2024)
Open-SQL Framework: Enhancing Text-to-SQL on Open-source Large Language Models
by: Chen, Xiaojun, et al.
Published: (2024)
GreedLlama: Performance of Financial Value-Aligned Large Language Models in Moral Reasoning
by: Yu, Jeffy, et al.
Published: (2024)
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
by: Zheng, Yaowei, et al.
Published: (2024)
AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs
by: Fathullah, Yassir, et al.
Published: (2023)
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment
by: Agarwalla, Abhinav, et al.
Published: (2024)
LLaDA-MoE: A Sparse MoE Diffusion Language Model
by: Zhu, Fengqi, et al.
Published: (2025)
BanglaLlama: LLaMA for Bangla Language
by: Zehady, Abdullah Khan, et al.
Published: (2024)
See or Say Graphs: Agent-Driven Scalable Graph Structure Understanding with Vision-Language Models
by: Han, Shuo, et al.
Published: (2025)
Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study
by: Tan, Shawn, et al.
Published: (2024)
TableLlama: Towards Open Large Generalist Models for Tables
by: Zhang, Tianshu, et al.
Published: (2023)
Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data
by: Li, Yunxin, et al.
Published: (2025)
OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale
by: Shi, Jingze, et al.
Published: (2026)
Sigma-MoE-Tiny Technical Report
by: Hu, Qingguo, et al.
Published: (2025)
Yuan3.0 Ultra: A Trillion-Parameter Enterprise-Oriented MoE LLM
by: ai, YuanLab., et al.
Published: (2026)
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
by: Qin, Zhen, et al.
Published: (2024)
Forbidden Facts: An Investigation of Competing Objectives in Llama-2
by: Wang, Tony T., et al.
Published: (2023)
LlamaCare: A Large Medical Language Model for Enhancing Healthcare Knowledge Sharing
by: Sun, Maojun
Published: (2024)
Assessing the Ability of Neural TTS Systems to Model Consonant-Induced F0 Perturbation
by: Yang, Tianle, et al.
Published: (2026)
Finding the Minimal Parameter Budget for Implicit Reasoning: A Data Complexity Driven Scaling Law for Language Models
by: Wang, Xinyi, et al.
Published: (2025)
A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2
by: Pietroń, Marcin, et al.
Published: (2026)
LlamaTurk: Adapting Open-Source Generative Large Language Models for Low-Resource Language
by: Toraman, Cagri
Published: (2024)
Llama-GENBA-10B: A Trilingual Large Language Model for German, English and Bavarian
by: Hoffmann, Michael, et al.
Published: (2025)
LEGO-GraphRAG: Modularizing Graph-based Retrieval-Augmented Generation for Design Space Exploration
by: Cao, Yukun, et al.
Published: (2024)
Fragile Knowledge, Robust Instruction-Following: The Width Pruning Dichotomy in Llama-3.2
by: Martra, Pere
Published: (2025)
Parameter Efficient Fine Tuning Llama 3.1 for Answering Arabic Legal Questions: A Case Study on Jordanian Laws
by: Fasha, Mohammed, et al.
Published: (2026)