:: Library Catalog

Bibliographic Details
Main Authors:	Shen, Yikang; Guo, Zhen; Cai, Tianle; Qin, Zengyi
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2404.07413

Similar Items

API Pack: A Massive Multi-Programming Language Dataset for API Call Generation
by: Guo, Zhen, et al.
Published: (2024)

MGH Radiology Llama: A Llama 3 70B Model for Radiology
by: Shi, Yucheng, et al.
Published: (2024)

Synthetic Data RL: Task Definition Is All You Need
by: Guo, Yiduo, et al.
Published: (2025)

BURExtract-Llama: An LLM for Clinical Concept Extraction in Breast Ultrasound Reports
by: Chen, Yuxuan, et al.
Published: (2024)

Training-Free Activation Sparsity in Large Language Models
by: Liu, James, et al.
Published: (2024)

Investigating Bias Representations in Llama 2 Chat via Activation Steering
by: Lu, Dawn, et al.
Published: (2024)

Open Llama2 Model for the Lithuanian Language
by: Nakvosas, Artūras, et al.
Published: (2024)

Benchmarking Linguistic Adaptation in Comparable-Sized LLMs: A Study of Llama-3.1-8B, Mistral-7B-v0.1, and Qwen3-8B on Romanized Nepali
by: Rimal, Ananda, et al.
Published: (2026)

Applying Refusal-Vector Ablation to Llama 3.1 70B Agents
by: Lermen, Simon, et al.
Published: (2024)

TinyLlama: An Open-Source Small Language Model
by: Zhang, Peiyuan, et al.
Published: (2024)

The Llama 3 Herd of Models
by: Grattafiori, Aaron, et al.
Published: (2024)

Steering Llama 2 via Contrastive Activation Addition
by: Panickssery, Nina, et al.
Published: (2023)

ReachAgent: Enhancing Mobile Agent via Page Reaching and Operation
by: Wu, Qinzhuo, et al.
Published: (2025)

Llama-Nemotron: Efficient Reasoning Models
by: Bercovich, Akhiad, et al.
Published: (2025)

Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
by: Du, Wenyu, et al.
Published: (2024)

Open-SQL Framework: Enhancing Text-to-SQL on Open-source Large Language Models
by: Chen, Xiaojun, et al.
Published: (2024)

GreedLlama: Performance of Financial Value-Aligned Large Language Models in Moral Reasoning
by: Yu, Jeffy, et al.
Published: (2024)

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
by: Zheng, Yaowei, et al.
Published: (2024)

AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs
by: Fathullah, Yassir, et al.
Published: (2023)

Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment
by: Agarwalla, Abhinav, et al.
Published: (2024)

LLaDA-MoE: A Sparse MoE Diffusion Language Model
by: Zhu, Fengqi, et al.
Published: (2025)

BanglaLlama: LLaMA for Bangla Language
by: Zehady, Abdullah Khan, et al.
Published: (2024)

See or Say Graphs: Agent-Driven Scalable Graph Structure Understanding with Vision-Language Models
by: Han, Shuo, et al.
Published: (2025)

Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study
by: Tan, Shawn, et al.
Published: (2024)

TableLlama: Towards Open Large Generalist Models for Tables
by: Zhang, Tianshu, et al.
Published: (2023)

Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data
by: Li, Yunxin, et al.
Published: (2025)

OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale
by: Shi, Jingze, et al.
Published: (2026)

Sigma-MoE-Tiny Technical Report
by: Hu, Qingguo, et al.
Published: (2025)

Yuan3.0 Ultra: A Trillion-Parameter Enterprise-Oriented MoE LLM
by: YuanLab.ai, et al.
Published: (2026)

Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
by: Qin, Zhen, et al.
Published: (2024)

Forbidden Facts: An Investigation of Competing Objectives in Llama-2
by: Wang, Tony T., et al.
Published: (2023)

LlamaCare: A Large Medical Language Model for Enhancing Healthcare Knowledge Sharing
by: Sun, Maojun
Published: (2024)

Assessing the Ability of Neural TTS Systems to Model Consonant-Induced F0 Perturbation
by: Yang, Tianle, et al.
Published: (2026)

Finding the Minimal Parameter Budget for Implicit Reasoning: A Data Complexity Driven Scaling Law for Language Models
by: Wang, Xinyi, et al.
Published: (2025)

A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2
by: Pietroń, Marcin, et al.
Published: (2026)

LlamaTurk: Adapting Open-Source Generative Large Language Models for Low-Resource Language
by: Toraman, Cagri
Published: (2024)

Llama-GENBA-10B: A Trilingual Large Language Model for German, English and Bavarian
by: Hoffmann, Michael, et al.
Published: (2025)

LEGO-GraphRAG: Modularizing Graph-based Retrieval-Augmented Generation for Design Space Exploration
by: Cao, Yukun, et al.
Published: (2024)

Fragile Knowledge, Robust Instruction-Following: The Width Pruning Dichotomy in Llama-3.2
by: Martra, Pere
Published: (2025)

Parameter Efficient Fine Tuning Llama 3.1 for Answering Arabic Legal Questions: A Case Study on Jordanian Laws
by: Fasha, Mohammed, et al.
Published: (2026)