:: Library Catalog

Bibliographic Details
Main Authors:	Rimal, Ananda; Rimal, Adarsha
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.14171

Similar Items

Introducing Super RAGs in Mistral 8x7B-v1
by: Thakur, Ayush, et al.
Published: (2024)

Towards Nepali-language LLMs: Efficient GPT training with a Nepali BPE tokenizer
by: Shrestha, Adarsha, et al.
Published: (2025)

Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders
by: He, Zhengfu, et al.
Published: (2024)

Llama-Mob: Instruction-Tuning Llama-3-8B Excels in City-Scale Mobility Prediction
by: Tang, Peizhi, et al.
Published: (2024)

Applying Refusal-Vector Ablation to Llama 3.1 70B Agents
by: Lermen, Simon, et al.
Published: (2024)

Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations
by: Nadeau, David, et al.
Published: (2024)

Metaheuristic Optimization Algorithm for Vulnerability Detection in Web of Things Environment
by: Rawat, Romil, et al.
Published: (2026)

Code Generation and Algorithmic Problem Solving Using Llama 3.1 405B
by: Deroy, Aniket, et al.
Published: (2024)

Advances in Complex Oxide Quantum Materials Through New Approaches to Molecular Beam Epitaxy
by: Rimal, Gaurab, et al.
Published: (2023)

Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection
by: Chen, Tianxiang, et al.
Published: (2024)

MGH Radiology Llama: A Llama 3 70B Model for Radiology
by: Shi, Yucheng, et al.
Published: (2024)

Lisbon Computational Linguists at SemEval-2024 Task 2: Using A Mistral 7B Model and Data Augmentation
by: Guimarães, Artur, et al.
Published: (2024)

EXAONE 3.0 7.8B Instruction Tuned Language Model
by: An, Soyoung, et al.
Published: (2024)

Extending LLMs to New Languages: A Case Study of Llama and Persian Adaptation
by: Sani, Samin Mahdizadeh, et al.
Published: (2024)

Concepts Whisper While Syntax Shouts: Spectral Anti-Concentration and the Dual Geometry of Transformer Representations
by: Acharya, Pratyush, et al.
Published: (2026)

Graph Attention Network-Based Detection of Autism Spectrum Disorder
by: Kelly, Abigail, et al.
Published: (2026)

LinguIUTics at PsyDefDetect: Iterative Imbalance-Aware Fine-tuning of Qwen3-8B for Psychological Defense Mechanism Classification
by: Adib, Shefayat E Shams, et al.
Published: (2026)

Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report
by: Yang, Zhuoran, et al.
Published: (2026)

Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report
by: Weerawardhena, Sajana, et al.
Published: (2025)

Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
by: Kassianik, Paul, et al.
Published: (2025)

Qwen3 Technical Report
by: Yang, An, et al.
Published: (2025)

Fine-Tuning Qwen 2.5 3B for Realistic Movie Dialogue Generation
by: Gupta, Kartik
Published: (2025)

Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation
by: Ociepa, Krzysztof, et al.
Published: (2024)

Llama-Embed-Nemotron-8B: A Universal Text Embedding Model for Multilingual and Cross-Lingual Tasks
by: Babakhin, Yauhen, et al.
Published: (2025)

Qwen3Guard Technical Report
by: Zhao, Haiquan, et al.
Published: (2025)

NepaliGPT: A Generative Language Model for the Nepali Language
by: Pudasaini, Shushanta, et al.
Published: (2025)

Generative AI in Academic Writing: A Comparison of DeepSeek, Qwen, ChatGPT, Gemini, Llama, Mistral, and Gemma
by: Aydin, Omer, et al.
Published: (2025)

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text
by: Kandpal, Nikhil, et al.
Published: (2025)

Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3
by: Yoon, Junsang, et al.
Published: (2024)

Domain Adaptation of Llama3-70B-Instruct through Continual Pre-Training and Model Merging: A Comprehensive Evaluation
by: Siriwardhana, Shamane, et al.
Published: (2024)

Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient and High-Performance Models for Multilingual ASR and AST
by: Sekoyan, Monica, et al.
Published: (2025)

JetMoE: Reaching Llama2 Performance with 0.1M Dollars
by: Shen, Yikang, et al.
Published: (2024)

Benchmarking EngGPT2-16B-A3B against Comparable Italian and International Open-source LLMs
by: Sassella, Andrea, et al.
Published: (2026)

BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B
by: Gade, Pranav, et al.
Published: (2023)

Qwen3-Coder-Next Technical Report
by: Cao, Ruisheng, et al.
Published: (2026)

Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking
by: Li, Mingxin, et al.
Published: (2026)

Parameter Efficient Fine Tuning Llama 3.1 for Answering Arabic Legal Questions: A Case Study on Jordanian Laws
by: Fasha, Mohammed, et al.
Published: (2026)

Consolidating and Developing Benchmarking Datasets for the Nepali Natural Language Understanding Tasks
by: Nyachhyon, Jinu, et al.
Published: (2024)

Development of CNN Architectures using Transfer Learning Methods for Medical Image Classification
by: Basyal, Ganga Prasad, et al.
Published: (2024)

Integrated Phytochemical and Bioactivity Evaluation of Ocimum tenuiflorum Linn. Essential Oil Against Multidrug‐Resistant Aeromonas spp.
by: Buragohain, Sanghamitra, et al.
Published: (2026)