| Main Authors: | Zhou, Wei; Huang, Hong; Zhang, Guowen; Shi, Ruize; Yin, Kehan; Lin, Yuanyuan; Liu, Bang |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.04598 |
Similar Items
Scalable Heterogeneous Graph Learning via Heterogeneous-aware Orthogonal Prototype Experts
by: Zhou, Wei, et al.
Published: (2026)
From General to Specific: Tailoring Large Language Models for Personalized Healthcare
by: Shi, Ruize, et al.
Published: (2024)
Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values
by: Dong, Haonan, et al.
Published: (2026)
Comprehensive Review and Empirical Evaluation of Causal Discovery Algorithms for Numerical Data
by: Niu, Wenjin, et al.
Published: (2024)
Causally-Enhanced Reinforcement Policy Optimization
by: Wang, Xiangqi, et al.
Published: (2025)
Revisiting Few-Shot Learning from a Causal Perspective
by: Lin, Guoliang, et al.
Published: (2022)
JAILJUDGE: A Comprehensive Jailbreak Judge Benchmark with Multi-Agent Enhanced Explanation Evaluation Framework
by: Liu, Fan, et al.
Published: (2024)
Causal Discovery as Dialectical Aggregation: A Quantitative Argumentation Framework
by: Wei, Sheng, et al.
Published: (2026)
Hybrid Local Causal Discovery
by: Ling, Zhaolong, et al.
Published: (2024)
Is Your VLM for Autonomous Driving Safety-Ready? A Comprehensive Benchmark for Evaluating External and In-Cabin Risks
by: Meng, Xianhui, et al.
Published: (2025)
Evaluating Progress in Graph Foundation Models: A Comprehensive Benchmark and New Insights
by: Yu, Xingtong, et al.
Published: (2026)
DMCD: Semantic-Statistical Framework for Causal Discovery
by: KaPatel, Samarth, et al.
Published: (2026)
ACCESS : A Benchmark for Abstract Causal Event Discovery and Reasoning
by: Vo, Vy, et al.
Published: (2025)
SkillFlow:Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents
by: Zhang, Ziao, et al.
Published: (2026)
AMSbench: A Comprehensive Benchmark for Evaluating MLLM Capabilities in AMS Circuits
by: Shi, Yichen, et al.
Published: (2025)
Revolutionizing Database Q&A with Large Language Models: Comprehensive Benchmark and Evaluation
by: Zheng, Yihang, et al.
Published: (2024)
Neural Information Causality
by: Bang, Jeongho, et al.
Published: (2026)
UltraEval-Audio: A Unified Framework for Comprehensive Evaluation of Audio Foundation Models
by: Shi, Qundong, et al.
Published: (2026)
Federated Causal Discovery from Heterogeneous Data
by: Li, Loka, et al.
Published: (2024)
Temporal Latent Variable Structural Causal Model for Causal Discovery under External Interferences
by: Cai, Ruichu, et al.
Published: (2025)
WebCoderBench: Benchmarking Web Application Generation with Comprehensive and Interpretable Evaluation Metrics
by: Liu, Chenxu, et al.
Published: (2026)
Challenges and Considerations in the Evaluation of Bayesian Causal Discovery
by: Mamaghan, Amir Mohammad Karimi, et al.
Published: (2024)
Dependency-based Anomaly Detection: a General Framework and Comprehensive Evaluation
by: Lu, Sha, et al.
Published: (2020)
SoK: a Comprehensive Causality Analysis Framework for Large Language Model Security
by: Zhao, Wei, et al.
Published: (2025)
CausalReasoningBenchmark: A Real-World Benchmark for Disentangled Evaluation of Causal Identification and Estimation
by: Sawarni, Ayush, et al.
Published: (2026)
TravelEval: A Comprehensive Benchmarking Framework for Evaluating LLM-Powered Travel Planning Agents
by: Chen, Weiyi, et al.
Published: (2026)
What Would Happen Next? Predicting Consequences from An Event Causality Graph
by: Zhan, Chuanhong, et al.
Published: (2024)
InsightVision: A Comprehensive, Multi-Level Chinese-based Benchmark for Evaluating Implicit Visual Semantics in Large Vision Language Models
by: Yin, Xiaofei, et al.
Published: (2025)
Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models
by: Lin, Zihao, et al.
Published: (2024)
SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation
by: Chen, Jingxuan, et al.
Published: (2024)
Differentiable Constraint-Based Causal Discovery
by: Zhou, Jincheng, et al.
Published: (2025)
MMCircuitEval: A Comprehensive Multimodal Circuit-Focused Benchmark for Evaluating LLMs
by: Zhao, Chenchen, et al.
Published: (2025)
Revisiting, Benchmarking and Understanding Unsupervised Graph Domain Adaptation
by: Liu, Meihan, et al.
Published: (2024)
Argumentative Causal Discovery
by: Russo, Fabrizio, et al.
Published: (2024)
ELABORATION: A Comprehensive Benchmark on Human-LLM Competitive Programming
by: Yang, Xinwei, et al.
Published: (2025)
MedEthicsQA: A Comprehensive Question Answering Benchmark for Medical Ethics Evaluation of LLMs
by: Wei, Jianhui, et al.
Published: (2025)
OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking
by: Qian, Zekun, et al.
Published: (2024)
MFE-ETP: A Comprehensive Evaluation Benchmark for Multi-modal Foundation Models on Embodied Task Planning
by: Zhang, Min, et al.
Published: (2024)
Benchmarking LLMs for Pairwise Causal Discovery in Biomedical and Multi-Domain Contexts
by: Anuyah, Sydney, et al.
Published: (2026)
UrbanPlanBench: A Comprehensive Urban Planning Benchmark for Evaluating Large Language Models
by: Zheng, Yu, et al.
Published: (2025)