Library Catalog

Bibliographic Details
Main Authors:	Nadeau, David; Kroutikov, Mike; McNeil, Karen; Baribeau, Simon
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2404.09785

Similar Items

Vision-Language and Large Language Model Performance in Gastroenterology: GPT, Claude, Llama, Phi, Mistral, Gemma, and Quantized Models
by: Safavi-Naini, Seyed Amir Ahmad, et al.
Published: (2024)

Generative AI in Academic Writing: A Comparison of DeepSeek, Qwen, ChatGPT, Gemini, Llama, Mistral, and Gemma
by: Aydin, Omer, et al.
Published: (2025)

From Bytes to Borsch: Fine-Tuning Gemma and Mistral for the Ukrainian Language Representation
by: Kiulian, Artur, et al.
Published: (2024)

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
by: Thawakar, Omkar, et al.
Published: (2024)

Visualizing and Benchmarking LLM Factual Hallucination Tendencies via Internal State Analysis and Clustering
by: Mao, Nathan, et al.
Published: (2026)

BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B
by: Gade, Pranav, et al.
Published: (2023)

On Early Detection of Hallucinations in Factual Question Answering
by: Snyder, Ben, et al.
Published: (2023)

Investigating Bias Representations in Llama 2 Chat via Activation Steering
by: Lu, Dawn, et al.
Published: (2024)

Investigating Symbolic Triggers of Hallucination in Gemma Models Across HaluEval and TruthfulQA
by: Lamba, Naveen, et al.
Published: (2025)

Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality Aligning
by: Wang, Shengyuan, et al.
Published: (2025)

Reducing Hallucinations in LLMs via Factuality-Aware Preference Learning
by: Chaduvula, Sindhuja, et al.
Published: (2026)

CodeGemma: Open Code Models Based on Gemma
by: CodeGemma Team, et al.
Published: (2024)

A Comparative Benchmark of a Moroccan Darija Toxicity Detection Model (Typica.ai) and Major LLM-Based Moderation APIs (OpenAI, Mistral, Anthropic)
by: Assoudi, Hicham
Published: (2025)

ShieldGemma: Generative AI Content Moderation Based on Gemma
by: Zeng, Wenjun, et al.
Published: (2024)

GKnow: Measuring the Entanglement of Gender Bias and Factual Gender
by: Veloso, Leonor, et al.
Published: (2026)

PretrainRL: Alleviating Factuality Hallucination of Large Language Models at the Beginning
by: Liu, Langming, et al.
Published: (2026)

Understanding New-Knowledge-Induced Factual Hallucinations in LLMs: Analysis and Interpretation
by: Dang, Renfei, et al.
Published: (2025)

Benchmarking Linguistic Adaptation in Comparable-Sized LLMs: A Study of Llama-3.1-8B, Mistral-7B-v0.1, and Qwen3-8B on Romanized Nepali
by: Rimal, Ananda, et al.
Published: (2026)

Mechanistic Understanding and Mitigation of Language Model Non-Factual Hallucinations
by: Yu, Lei, et al.
Published: (2024)

Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2
by: Lieberum, Tom, et al.
Published: (2024)

The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models
by: Li, Junyi, et al.
Published: (2024)

Not All That Is Fluent Is Factual: Investigating Hallucinations of Large Language Models in Academic Writing
by: Khan, Humam, et al.
Published: (2026)

Exploring the Generalizability of Factual Hallucination Mitigation via Enhancing Precise Knowledge Utilization
by: Zhang, Siyuan, et al.
Published: (2025)

JointCQ: Improving Factual Hallucination Detection with Joint Claim and Query Generation
by: Xu, Fan, et al.
Published: (2025)

AutoHall: Automated Factuality Hallucination Dataset Generation for Large Language Models
by: Cao, Zouying, et al.
Published: (2023)

Predicting Sentence-Level Factuality of News and Bias of Media Outlets
by: Vargas, Francielle, et al.
Published: (2023)

Linq-Embed-Mistral Technical Report
by: Choi, Chanyeol, et al.
Published: (2024)

FFT: Towards Harmlessness Evaluation and Analysis for LLMs with Factuality, Fairness, Toxicity
by: Cui, Shiyao, et al.
Published: (2023)

TranslateGemma Technical Report
by: Finkelstein, Mara, et al.
Published: (2026)

Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
by: Zhang, Xiaoying, et al.
Published: (2024)

T5Gemma 2: Seeing, Reading, and Understanding Longer
by: Zhang, Biao, et al.
Published: (2025)

ChocoLlama: Lessons Learned From Teaching Llamas Dutch
by: Meeus, Matthieu, et al.
Published: (2024)

Context-Efficient Retrieval with Factual Decomposition
by: Li, Yanhong, et al.
Published: (2025)

Enhancing Next-Generation Language Models with Knowledge Graphs: Extending Claude, Mistral IA, and GPT-4 via KG-BERT
by: Chaabene, Nour El Houda Ben, et al.
Published: (2025)

TathyaNyaya and FactLegalLlama: Advancing Factual Judgment Prediction and Explanation in the Indian Legal Context
by: Nigam, Shubham Kumar, et al.
Published: (2025)

Gemma 3 Technical Report
by: Gemma Team, et al.
Published: (2025)

Evaluating Students' Open-ended Written Responses with LLMs: Using the RAG Framework for GPT-3.5, GPT-4, Claude-3, and Mistral-Large
by: Jauhiainen, Jussi S., et al.
Published: (2024)

When Benchmarks Age: Temporal Misalignment through Large Language Model Factuality Evaluation
by: Jiang, Xunyi, et al.
Published: (2025)

MedGemma Technical Report
by: Sellergren, Andrew, et al.
Published: (2025)

100% Elimination of Hallucinations on RAGTruth for GPT-4 and GPT-3.5 Turbo
by: Wood, Michael C., et al.
Published: (2024)