| Main Author: | Gupta, Aayush |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.06112 |
Similar Items
Fact Grounded Attention: Eliminating Hallucination in Large Language Models Through Attention Level Knowledge Integration
by: Gupta, Aayush
Published: (2025)
LemmaBench: A Live, Research-Level Benchmark to Evaluate LLM Capabilities in Mathematics
by: Peyronnet, Antoine, et al.
Published: (2026)
React-ing to Grace Hopper 200: Five Open-Weights Coding Models, One React Native App, One GH200, One Weekend
by: Potanin, Alex
Published: (2026)
Aligning LLMs for Multilingual Consistency in Enterprise Applications
by: Agarwal, Amit, et al.
Published: (2025)
NeuroState-Bench: A Human-Calibrated Benchmark for Commitment Integrity in LLM Agent Profiles
by: Jia, Xiao
Published: (2026)
Classifier-Augmented Generation for Structured Workflow Prediction
by: Gschwind, Thomas, et al.
Published: (2025)
Generative AI and the Transformation of Software Development Practices
by: Acharya, Vivek
Published: (2025)
SLEAN: Simple Lightweight Ensemble Analysis Network for Multi-Provider LLM Coordination: Design, Implementation, and Vibe Coding Bug Investigation Case Study
by: Vargas, Matheus J. T.
Published: (2025)
Bug In the Code Stack: Can LLMs Find Bugs in Large Python Code Stacks
by: Lee, Hokyung, et al.
Published: (2024)
Vis-CoT: A Human-in-the-Loop Framework for Interactive Visualization and Intervention in LLM Chain-of-Thought Reasoning
by: Pather, Kaviraj, et al.
Published: (2025)
Kodezi Chronos: A Debugging-First Language Model for Repository-Scale Code Understanding
by: Khan, Ishraq, et al.
Published: (2025)
Hallucination Detection in Large Language Models with Metamorphic Relations
by: Yang, Borui, et al.
Published: (2025)
From Noise to Diversity: Random Embedding Injection in LLM Reasoning
by: Kim, Heejun, et al.
Published: (2026)
MEDLEY-BENCH: Scale Buys Evaluation but Not Control in AI Metacognition
by: Abtahi, Farhad, et al.
Published: (2026)
Co-NAML-LSTUR: A Combined Model with Attentive Multi-View Learning and Long- and Short-term User Representations for News Recommendation
by: Nguyen, Minh Hoang, et al.
Published: (2025)
When No Benchmark Exists: Validating Comparative LLM Safety Scoring Without Ground-Truth Labels
by: Gautam, Sushant, et al.
Published: (2026)
A Method for Quantifying Human Risk and a Blueprint for LLM Integration
by: Canale, Giuseppe
Published: (2025)
Harnessing non-adversarial robustness in large language models
by: Zhou, Qinghua, et al.
Published: (2026)
Unraveling Media Perspectives: A Comprehensive Methodology Combining Large Language Models, Topic Modeling, Sentiment Analysis, and Ontology Learning to Analyse Media Bias
by: Jähde, Orlando, et al.
Published: (2025)
BreakFun: Jailbreaking LLMs via Schema Exploitation
by: Oskooei, Amirkia Rafiei, et al.
Published: (2025)
A Computational Approach to Modeling Conversational Systems: Analyzing Large-Scale Quasi-Patterned Dialogue Flows
by: Ammar, Mohamed Achref Ben, et al.
Published: (2025)
Systematic Capability Benchmarking of Frontier Large Language Models for Offensive Cyber Tasks
by: Merves, Tyler H., et al.
Published: (2026)
How much do LLMs learn from negative examples?
by: Hamdan, Shadi, et al.
Published: (2025)
Approaches to Semantic Textual Similarity in Slovak Language: From Algorithms to Transformers
by: Radosky, Lukas, et al.
Published: (2026)
Robustness, Cost, and Attack-Surface Concentration in Phishing Detection
by: Allagan, Julian, et al.
Published: (2026)
MedMemoryBench: Benchmarking Agent Memory in Personalized Healthcare
by: Wang, Yihao, et al.
Published: (2026)
Supporting software engineering tasks with agentic AI: Demonstration on document retrieval and test scenario generation
by: Kica, Marian, et al.
Published: (2026)
Enhancing OCR for Sino-Vietnamese Language Processing via Fine-tuned PaddleOCRv5
by: Nguyen, Minh Hoang, et al.
Published: (2025)
Multi-Agent Object Detection Framework Based on Raspberry Pi YOLO Detector and Slack-Ollama Natural Language Interface
by: Kalušev, Vladimir, et al.
Published: (2026)
LLM-supported document separation for printed reviews from zbMATH Open
by: Pluzhnikov, Ivan, et al.
Published: (2026)
Tool-Genesis: A Task-Driven Tool Creation Benchmark for Self-Evolving Language Agent
by: Xia, Bowei, et al.
Published: (2026)
Dark LLMs: The Growing Threat of Unaligned AI Models
by: Fire, Michael, et al.
Published: (2025)
DRS-OSS: Practical Diff Risk Scoring with LLMs
by: Sayedsalehi, Ali, et al.
Published: (2025)
The Pursuit of Empathy: Evaluating Small Language Models for PTSD Dialogue Support
by: BN, Suhas, et al.
Published: (2025)
Mechanistic Analysis of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning
by: Imanov, Olaf Yunus Laitinen
Published: (2026)
Latent Object Permanence: Topological Phase Transitions, Free-Energy Principles, and Renormalization Group Flows in Deep Transformer Manifolds
by: Alpay, Faruk, et al.
Published: (2026)
Inference acceleration for large language models using "stairs" assisted greedy generation
by: Grigaliūnas, Domas, et al.
Published: (2024)
Strategic Doctrine Language Models (sdLM): A Learning-System Framework for Doctrinal Consistency and Geopolitical Forecasting
by: Imanov, Olaf Yunus Laitinen, et al.
Published: (2026)
Multi-Agent Synergy-Driven Iterative Visual Narrative Synthesis
by: Xi, Wang, et al.
Published: (2025)
Generative AI Models: Opportunities and Risks for Industry and Authorities
by: Alt, Tobias, et al.
Published: (2024)