| Main Authors: | Rimal, Ananda, Rimal, Adarsha |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.14171 |
Similar Items
Introducing Super RAGs in Mistral 8x7B-v1
by: Thakur, Ayush, et al.
Published: (2024)
Towards Nepali-language LLMs: Efficient GPT training with a Nepali BPE tokenizer
by: Shrestha, Adarsha, et al.
Published: (2025)
Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders
by: He, Zhengfu, et al.
Published: (2024)
Llama-Mob: Instruction-Tuning Llama-3-8B Excels in City-Scale Mobility Prediction
by: Tang, Peizhi, et al.
Published: (2024)
Applying Refusal-Vector Ablation to Llama 3.1 70B Agents
by: Lermen, Simon, et al.
Published: (2024)
Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations
by: Nadeau, David, et al.
Published: (2024)
Metaheuristic Optimization Algorithm for Vulnerability Detection in Web of Things Environment
by: Romil Rawat, et al.
Published: (2026)
Code Generation and Algorithmic Problem Solving Using Llama 3.1 405B
by: Deroy, Aniket, et al.
Published: (2024)
Advances in Complex Oxide Quantum Materials Through New Approaches to Molecular Beam Epitaxy
by: Rimal, Gaurab, et al.
Published: (2023)
Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection
by: Chen, Tianxiang, et al.
Published: (2024)
MGH Radiology Llama: A Llama 3 70B Model for Radiology
by: Shi, Yucheng, et al.
Published: (2024)
Lisbon Computational Linguists at SemEval-2024 Task 2: Using A Mistral 7B Model and Data Augmentation
by: Guimarães, Artur, et al.
Published: (2024)
EXAONE 3.0 7.8B Instruction Tuned Language Model
by: An, Soyoung, et al.
Published: (2024)
Extending LLMs to New Languages: A Case Study of Llama and Persian Adaptation
by: Sani, Samin Mahdizadeh, et al.
Published: (2024)
Concepts Whisper While Syntax Shouts: Spectral Anti-Concentration and the Dual Geometry of Transformer Representations
by: Acharya, Pratyush, et al.
Published: (2026)
Graph Attention Network-Based Detection of Autism Spectrum Disorder
by: Kelly, Abigail, et al.
Published: (2026)
LinguIUTics at PsyDefDetect: Iterative Imbalance-Aware Fine-tuning of Qwen3-8B for Psychological Defense Mechanism Classification
by: Adib, Shefayat E Shams, et al.
Published: (2026)
Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report
by: Yang, Zhuoran, et al.
Published: (2026)
Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report
by: Weerawardhena, Sajana, et al.
Published: (2025)
Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
by: Kassianik, Paul, et al.
Published: (2025)
Qwen3 Technical Report
by: Yang, An, et al.
Published: (2025)
Fine-Tuning Qwen 2.5 3B for Realistic Movie Dialogue Generation
by: Gupta, Kartik
Published: (2025)
Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation
by: Ociepa, Krzysztof, et al.
Published: (2024)
Llama-Embed-Nemotron-8B: A Universal Text Embedding Model for Multilingual and Cross-Lingual Tasks
by: Babakhin, Yauhen, et al.
Published: (2025)
Qwen3Guard Technical Report
by: Zhao, Haiquan, et al.
Published: (2025)
NepaliGPT: A Generative Language Model for the Nepali Language
by: Pudasaini, Shushanta, et al.
Published: (2025)
Generative AI in Academic Writing: A Comparison of DeepSeek, Qwen, ChatGPT, Gemini, Llama, Mistral, and Gemma
by: Aydin, Omer, et al.
Published: (2025)
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text
by: Kandpal, Nikhil, et al.
Published: (2025)
Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3
by: Yoon, Junsang, et al.
Published: (2024)
Domain Adaptation of Llama3-70B-Instruct through Continual Pre-Training and Model Merging: A Comprehensive Evaluation
by: Siriwardhana, Shamane, et al.
Published: (2024)
Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient and High-Performance Models for Multilingual ASR and AST
by: Sekoyan, Monica, et al.
Published: (2025)
JetMoE: Reaching Llama2 Performance with 0.1M Dollars
by: Shen, Yikang, et al.
Published: (2024)
Benchmarking EngGPT2-16B-A3B against Comparable Italian and International Open-source LLMs
by: Sassella, Andrea, et al.
Published: (2026)
BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B
by: Gade, Pranav, et al.
Published: (2023)
Qwen3-Coder-Next Technical Report
by: Cao, Ruisheng, et al.
Published: (2026)
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking
by: Li, Mingxin, et al.
Published: (2026)
Parameter Efficient Fine Tuning Llama 3.1 for Answering Arabic Legal Questions: A Case Study on Jordanian Laws
by: Fasha, Mohammed, et al.
Published: (2026)
Consolidating and Developing Benchmarking Datasets for the Nepali Natural Language Understanding Tasks
by: Nyachhyon, Jinu, et al.
Published: (2024)
Development of CNN Architectures using Transfer Learning Methods for Medical Image Classification
by: Basyal, Ganga Prasad, et al.
Published: (2024)
Integrated Phytochemical and Bioactivity Evaluation of Ocimum tenuiflorum Linn. Essential Oil Against Multidrug‐Resistant Aeromonas spp.
by: Sanghamitra Buragohain, et al.
Published: (2026)