| Main Authors: | Rahman, Salman; Issaka, Sheriff; Suvarna, Ashima; Liu, Genglin; Shiffer, James; Lee, Jaeyoung; Parvez, Md Rizwan; Palangi, Hamid; Feng, Shi; Peng, Nanyun; Choi, Yejin; Michael, Julian; Jiang, Liwei; Gabriel, Saadia |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.02175 |
Similar Items
X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents
by: Rahman, Salman, et al.
Published: (2025)
ModelCitizens: Representing Community Voices in Online Safety
by: Suvarna, Ashima, et al.
Published: (2025)
PhonologyBench: Evaluating Phonological Skills of Large Language Models
by: Suvarna, Ashima, et al.
Published: (2024)
When Can LLMs Learn to Reason with Weak Supervision?
by: Rahman, Salman, et al.
Published: (2026)
MOSAIC: Modeling Social AI for Content Dissemination and Regulation in Multi-Agent Simulations
by: Liu, Genglin, et al.
Published: (2025)
SUPERNOVA: Eliciting General Reasoning in LLMs with Reinforcement Learning on Natural Instructions
by: Suvarna, Ashima, et al.
Published: (2026)
Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team
by: Hosain, Md Tanzib, et al.
Published: (2025)
Reward Engineering for Reinforcement Learning in Software Tasks
by: Masud, Md Rayhanul, et al.
Published: (2026)
QUDSELECT: Selective Decoding for Questions Under Discussion Parsing
by: Suvarna, Ashima, et al.
Published: (2024)
Chain of Evidences and Evidence to Generate: Prompting for Context Grounded and Retrieval Augmented Reasoning
by: Parvez, Md Rizwan
Published: (2024)
Translation as a Scalable Proxy for Multilingual Evaluation
by: Issaka, Sheriff, et al.
Published: (2026)
Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization
by: Bansal, Hritik, et al.
Published: (2024)
Multi-Objective Alignment of Language Models for Personalized Psychotherapy
by: Beikzadeh, Mehrab, et al.
Published: (2026)
VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval
by: Paul, Dhiman, et al.
Published: (2024)
Courtroom-Style Multi-Agent Debate with Progressive RAG and Role-Switching for Controversial Claim Verification
by: Chowdhury, Masnun Nuha, et al.
Published: (2026)
A Survey on Agentic Security: Applications, Threats and Defenses
by: Shahriar, Asif, et al.
Published: (2025)
How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models
by: Lee, Jaeyoung, et al.
Published: (2024)
Exploring Group and Symmetry Principles in Large Language Models
by: Imani, Shima, et al.
Published: (2024)
CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging
by: Islam, Md. Ashraful, et al.
Published: (2025)
MapCoder: Multi-Agent Code Generation for Competitive Problem Solving
by: Islam, Md. Ashraful, et al.
Published: (2024)
DataNarrative: Automated Data-Driven Storytelling with Visualizations and Texts
by: Islam, Mohammed Saidul, et al.
Published: (2024)
Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures
by: Yerukola, Akhila, et al.
Published: (2025)
Self-Consistency from Only Two Samples: CoT-PoT Ensembling for Efficient LLM Reasoning
by: Saparkhan, Raman, et al.
Published: (2026)
Omni-Modal Dissonance Benchmark: Systematically Breaking Modality Consensus to Probe Robustness and Calibrated Abstention
by: Nazi, Zabir Al, et al.
Published: (2026)
MapQaTor: An Extensible Framework for Efficient Annotation of Map-Based QA Datasets
by: Dihan, Mahir Labib, et al.
Published: (2024)
Improving Event Definition Following For Zero-Shot Event Detection
by: Cai, Zefan, et al.
Published: (2024)
DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
by: Chiu, Yu Ying, et al.
Published: (2024)
Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models
by: Islam, Shayekh Bin, et al.
Published: (2024)
TechniqueRAG: Retrieval Augmented Generation for Adversarial Technique Annotation in Cyber Threat Intelligence Text
by: Lekssays, Ahmed, et al.
Published: (2025)
The Art of Saying "Maybe": A Conformal Lens for Uncertainty Benchmarking in VLMs
by: Azad, Asif, et al.
Published: (2025)
CompassLLM: A Multi-Agent Approach toward Geo-Spatial Reasoning for Popular Path Query
by: Ananto, Md. Nazmul Islam, et al.
Published: (2025)
Les partis politiques de l'opposition en Afrique
by: Souaré, Issaka
Published: (2019)
The Ghanaian NLP Landscape: A First Look
by: Issaka, Sheriff, et al.
Published: (2024)
Can Language Models Reason about Individualistic Human Values and Preferences?
by: Jiang, Liwei, et al.
Published: (2024)
Mina: A Multilingual LLM-Powered Legal Assistant Agent for Bangladesh for Empowering Access to Justice
by: Wasi, Azmine Toushik, et al.
Published: (2025)
ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning
by: Masry, Ahmed, et al.
Published: (2024)
WebOperator: Action-Aware Tree Search for Autonomous Agents in Web Environment
by: Dihan, Mahir Labib, et al.
Published: (2025)
Improving Language Models Trained on Translated Data with Continual Pre-Training and Dictionary Learning Analysis
by: Boughorbel, Sabri, et al.
Published: (2024)
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models
by: Dihan, Mahir Labib, et al.
Published: (2024)
Effects of Lipopolysaccharide Core Modulation on Outer Membrane Protein Function and Virulence in Pectobacterium carotovorum
by: Yejin Park, et al.
Published: (2026)