| Main Authors: | Lu, Junyu, Ma, Kai, Wang, Kaichun, Xiao, Kelaiti, Lee, Roy Ka-Wei, Xu, Bo, Yang, Liang, Lin, Hongfei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.06207 |
Similar Items
Visual Puns from Idioms: An Iterative LLM-T2IM-MLLM Framework
by: Xiao, Kelaiti, et al.
Published: (2025)
Aligning LLM Uncertainty with Human Disagreement in Subjectivity Analysis
by: Lu, Junyu, et al.
Published: (2026)
VisualQuest: A Benchmark for Abstract Visual Reasoning in MLLMs
by: Xiao, Kelaiti, et al.
Published: (2025)
ToxiCloakCN: Evaluating Robustness of Offensive Language Detection in Chinese with Cloaking Perturbations
by: Xiao, Yunze, et al.
Published: (2024)
Multi-Agent VLMs Guided Self-Training with PNU Loss for Low-Resource Offensive Content Detection
by: Wang, Han, et al.
Published: (2025)
Harder to Defend: Towards Chinese Toxicity Attacks via Implicit Enhancement and Obfuscation Rewriting
by: Kang, Jingyi, et al.
Published: (2026)
Towards Patronizing and Condescending Language in Chinese Videos: A Multimodal Dataset and Detector
by: Wang, Hongbo, et al.
Published: (2024)
PclGPT: A Large Language Model for Patronizing and Condescending Language Detection
by: Wang, Hongbo, et al.
Published: (2024)
Overconfidence in LLM-as-a-Judge: Diagnosis and Confidence-Driven Solution
by: Tian, Zailong, et al.
Published: (2025)
Take its Essence, Discard its Dross! Debiasing for Toxic Language Detection via Counterfactual Causal Effect
by: Lu, Junyu, et al.
Published: (2024)
Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive
by: Weerasooriya, Tharindu Cyril, et al.
Published: (2023)
D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation
by: Davani, Aida Mostafazadeh, et al.
Published: (2024)
Integrating Multi-view Analysis: Multi-view Mixture-of-Expert for Textual Personality Detection
by: Zhu, Haohao, et al.
Published: (2024)
When Disagreements Elicit Robustness: Investigating Self-Repair Capabilities under LLM Multi-Agent Disagreements
by: Ju, Tianjie, et al.
Published: (2025)
From Text to Emotion: Unveiling the Emotion Annotation Capabilities of LLMs
by: Niu, Minxue, et al.
Published: (2024)
Guardians of Discourse: Evaluating LLMs on Multilingual Offensive Language Detection
by: He, Jianfei, et al.
Published: (2024)
Towards Comprehensive Detection of Chinese Harmful Memes
by: Lu, Junyu, et al.
Published: (2024)
HateClipSeg: A Segment-Level Annotated Dataset for Fine-Grained Hate Video Detection
by: Wang, Han, et al.
Published: (2025)
Chinese Offensive Language Detection: Current Status and Future Directions
by: Xiao, Yunze, et al.
Published: (2024)
Judging the Judges: A Systematic Study of Position Bias in LLM-as-a-Judge
by: Shi, Lin, et al.
Published: (2024)
With Great Capabilities Come Great Responsibilities: Introducing the Agentic Risk & Capability Framework for Governing Agentic AI Systems
by: Khoo, Shaun, et al.
Published: (2025)
Language, Culture, and Ideology: Personalizing Offensiveness Detection in Political Tweets with Reasoning LLMs
by: Pihulski, Dzmitry, et al.
Published: (2025)
OCCULT: Evaluating Large Language Models for Offensive Cyber Operation Capabilities
by: Kouremetis, Michael, et al.
Published: (2025)
Evaluating Annotation Consistency in Offensive Language Detection: A Data Analytics Approach on the TweetEval Dataset
by: Fabeela Ali Rawther, Abhinay A K, Anagha Tess B, Alan Joseph, Adham Saheer
Published: (2025)
Leveraging Annotator Disagreement for Text Classification
by: Xu, Jin, et al.
Published: (2024)
Systematic Capability Benchmarking of Frontier Large Language Models for Offensive Cyber Tasks
by: Merves, Tyler H., et al.
Published: (2026)
Taming Overconfidence in LLMs: Reward Calibration in RLHF
by: Leng, Jixuan, et al.
Published: (2024)
Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs
by: Xu, Chenjun, et al.
Published: (2025)
The Alternative Annotator Test for LLM-as-a-Judge: How to Statistically Justify Replacing Human Annotators with LLMs
by: Calderon, Nitay, et al.
Published: (2025)
Heterogeneous Judge-Aware Ranking with Sensitivity, Disagreement, and Confidence
by: Yu, Shibo, et al.
Published: (2026)
Towards Effective Offensive Security LLM Agents: Hyperparameter Tuning, LLM as a Judge, and a Lightweight CTF Benchmark
by: Shao, Minghao, et al.
Published: (2025)
MemGuard-Alpha: Detecting and Filtering Memorization-Contaminated Signals in LLM-Based Financial Forecasting via Membership Inference and Cross-Model Disagreement
by: Roy, Anisha, et al.
Published: (2026)
Same Verdict, Different Reasons: LLM-as-a-Judge and Clinician Disagreement on Medical Chatbot Completeness
by: DeLucia, Alexandra, et al.
Published: (2026)
Calibrating Probabilistic Object Detectors with Annotator Disagreement
by: Tan, Zhi Qin, et al.
Published: (2026)
Dealing with Annotator Disagreement in Hate Speech Classification
by: Dehghan, Somaiyeh, et al.
Published: (2025)
Function-based Labels for Complementary Recommendation: Definition, Annotation, and LLM-as-a-Judge
by: Yamasaki, Chihiro, et al.
Published: (2025)
Enhancing Textual Personality Detection toward Social Media: Integrating Long-term and Short-term Perspectives
by: Zhu, Haohao, et al.
Published: (2024)
Distinguishing Right from Wrong in Debates: Attribution Analysis of Chinese Harmful Memes
by: Wang, Weiming, et al.
Published: (2026)
Detecting Offensive Memes with Social Biases in Singapore Context Using Multimodal Large Language Models
by: Yuxuan, Cao, et al.
Published: (2025)
Detection and Analysis of Offensive Online Content in Hausa Language
by: Adam, Fatima Muhammad, et al.
Published: (2023)