Library Catalog

Bibliographic Details
Main Authors:	Joshi, Raviraj; Paul, Rakesh; Singla, Kanishk; Kamath, Anusha; Evans, Michael; Luna, Katherine; Ghosh, Shaona; Vaidya, Utkarsh; Long, Eileen; Chauhan, Sanjay Singh; Wartikar, Niranjan
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2508.01710

Similar Items

Benchmarking Hindi LLMs: A New Suite of Datasets and a Comparative Analysis
by: Kamath, Anusha, et al.
Published: (2025)

Aligning Large Language Models to Low-Resource Languages through LLM-Based Selective Translation: A Systematic Study
by: Paul, Rakesh, et al.
Published: (2025)

Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus
by: Joshi, Raviraj, et al.
Published: (2024)

SEA-Guard: Culturally Grounded Multilingual Safeguard for Southeast Asia
by: Tasawong, Panuthep, et al.
Published: (2026)

X-Guard: Multilingual Guard Agent for Content Moderation
by: Upadhayay, Bibek, et al.
Published: (2025)

MrGuard: A Multilingual Reasoning Guardrail for Universal LLM Safety
by: Yang, Yahan, et al.
Published: (2025)

PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages
by: Kumar, Priyanshu, et al.
Published: (2025)

UbuntuGuard: A Culturally-Grounded Policy Benchmark for Equitable AI Safety in African Languages
by: Abdullahi, Tassallah, et al.
Published: (2026)

Guarding Terrains with Guards on a Line
by: Kang, Byeonguk, et al.
Published: (2025)

IndicSQuAD: A Comprehensive Multilingual Question Answering Dataset for Indic Languages
by: Endait, Sharvi, et al.
Published: (2025)

FanarGuard: A Culturally-Aware Moderation Filter for Arabic Language Models
by: Fatehkia, Masoomali, et al.
Published: (2025)

Can One Safety Loop Guard Them All? Agentic Guard Rails for Federated Computing
by: Veeraragavan, Narasimha Raghavan, et al.
Published: (2025)

L3Cube-MahaSTS: A Marathi Sentence Similarity Dataset and Models
by: Mirashi, Aishwarya, et al.
Published: (2025)

ML-Bench&Guard: Policy-Grounded Multilingual Safety Benchmark and Guardrail for Large Language Models
by: Zhao, Yunhan, et al.
Published: (2026)

Poly-Guard: Massive Multi-Domain Safety Policy-Grounded Guardrail Dataset
by: Kang, Mintong, et al.
Published: (2025)

L3Cube-MahaEmotions: A Marathi Emotion Recognition Dataset with Synthetic Annotations using CoTR prompting and Large Language Models
by: Kowtal, Nidhi, et al.
Published: (2025)

SafeSteer: Interpretable Safety Steering with Refusal-Evasion in LLMs
by: Ghosh, Shaona, et al.
Published: (2025)

DiffusionGuard
by: Lei, Yefei
Published: (2026)

MalGuard
by: ossgraud
Published: (2025)

AprielGuard
by: Kasundra, Jaykumar, et al.
Published: (2025)

The Guard And The Formula
by: Carlos Eduardo Paletta Guedes
Published: (2019)

The Changing of the Guard
Published: (1972)

PoseGuard: Pose-Guided Generation with Safety Guardrails
by: Wang, Kongxin, et al.
Published: (2025)

PL-Guard: Benchmarking Language Model Safety for Polish
by: Krasnodębska, Aleksandra, et al.
Published: (2025)

Dirt, Dwellings and Culture
by: Reilly, Eileen
Published: (2024)

AEGIS: Online Adaptive AI Content Safety Moderation with Ensemble of LLM Experts
by: Ghosh, Shaona, et al.
Published: (2024)

Towards Inference-time Category-wise Safety Steering for Large Language Models
by: Bhattacharjee, Amrita, et al.
Published: (2024)

Surfacing Semantic Orthogonality Across Model Safety Benchmarks: A Multi-Dimensional Analysis
by: Bennion, Jonathan, et al.
Published: (2025)

Better To Ask in English? Evaluating Factual Accuracy of Multilingual LLMs in English and Low-Resource Languages
by: Rohera, Pritika, et al.
Published: (2025)

Regularising Spectral Curves for Homogeneous Yang-Baxter strings
by: Driezen, Sibylle, et al.
Published: (2024)

Guarding the Meaning: Self-Supervised Training for Semantic Robustness in Guard Models
by: Pinneri, Cristina, et al.
Published: (2025)

RefusalGuard: Geometry-Preserving Fine-Tuning for Safety in LLMs
by: Asif, Sadia, et al.
Published: (2026)

DiffGuard: Text-Based Safety Checker for Diffusion Models
by: Khader, Massine El, et al.
Published: (2024)

AlignGuard: Scalable Safety Alignment for Text-to-Image Generation
by: Liu, Runtao, et al.
Published: (2024)

ProbGuard: Probabilistic Runtime Monitoring for LLM Agent Safety
by: Wang, Haoyu, et al.
Published: (2025)

Latent Guard: a Safety Framework for Text-to-image Generation
by: Liu, Runtao, et al.
Published: (2024)

Contiguous Boundary Guarding
by: Biniaz, Ahmad, et al.
Published: (2024)

Protecting Guard Status
Published: (2024)

Robustly Guarding Polygons
by: Das, Rathish, et al.
Published: (2024)

Who Guards the Guardians? The Challenges of Evaluating Identifiability of Learned Representations
by: Joshi, Shruti, et al.
Published: (2026)