| Main Authors: | Shi, Wenlei; Jin, Xing |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.10337 |
Similar Items
Process Supervision-Guided Policy Optimization for Code Generation
by: Dai, Ning, et al.
Published: (2024)
RomanLens: The Role Of Latent Romanization In Multilinguality In LLMs
by: Saji, Alan, et al.
Published: (2025)
Text-Based Approaches to Item Difficulty Modeling in Large-Scale Assessments: A Systematic Review
by: Peters, Sydney, et al.
Published: (2025)
What Makes a Good AI Review? Concern-Level Diagnostics for AI Peer Review
by: Jin, Ming
Published: (2026)
Evaluating the Efficacy of Hybrid Deep Learning Models in Distinguishing AI-Generated Text
by: Oketunji, Abiodun Finbarrs
Published: (2023)
LLMs taking shortcuts in test generation: A study with SAP HANA and LevelDB
by: Bekmyradov, Vekil, et al.
Published: (2026)
Identifying Bias in Machine-generated Text Detection
by: Stowe, Kevin, et al.
Published: (2025)
Curveball Steering: The Right Direction To Steer Isn't Always Linear
by: Raval, Shivam, et al.
Published: (2026)
Analyzing Large language models chatbots: An experimental approach using a probability test
by: Peruchini, Melise, et al.
Published: (2024)
SQLord: A Robust Enterprise Text-to-SQL Solution via Reverse Data Generation and Workflow Decomposition
by: Cheng, Song, et al.
Published: (2025)
Learning Software Bug Reports: A Systematic Literature Review
by: Long, Guoming, et al.
Published: (2025)
Deciphering Digital Detectives: Understanding LLM Behaviors and Capabilities in Multi-Agent Mystery Games
by: Wu, Dekun, et al.
Published: (2023)
A Use-Case Specific Dataset for Measuring Dimensions of Responsible Performance in LLM-generated Text
by: Sagae, Alicia, et al.
Published: (2025)
Active Context Compression: Autonomous Memory Management in LLM Agents
by: Verma, Nikhil
Published: (2026)
PetKaz at SemEval-2024 Task 8: Can Linguistics Capture the Specifics of LLM-generated Text?
by: Petukhova, Kseniia, et al.
Published: (2024)
SOCIA-Nabla: Textual Gradient Meets Multi-Agent Orchestration for Automated Simulator Generation
by: Hua, Yuncheng, et al.
Published: (2025)
ALAS: A Stateful Multi-LLM Agent Framework for Disruption-Aware Planning
by: Chang, Edward Y., et al.
Published: (2025)
Evaluating Steering Techniques using Human Similarity Judgments
by: Studdiford, Zach, et al.
Published: (2025)
Reasoning-Based AI for Startup Evaluation (R.A.I.S.E.): A Memory-Augmented, Multi-Step Decision Framework
by: Preuveneers, Jack, et al.
Published: (2025)
AI-Powered Annotation Pipelines for Stabilizing Large Language Models: A Human-AI Synergy Approach
by: Pathak, Gangesh, et al.
Published: (2025)
LLMs are Capable of Misaligned Behavior Under Explicit Prohibition and Surveillance
by: Ivanov, Igor
Published: (2025)
From Extraction to Synthesis: Entangled Heuristics for Agent-Augmented Strategic Reasoning
by: Ghisellini, Renato, et al.
Published: (2025)
Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs
by: Xu, Chenjun, et al.
Published: (2025)
LLM-based Automated Theorem Proving Hinges on Scalable Synthetic Data Generation
by: Lai, Junyu, et al.
Published: (2025)
The Unified Cognitive Consciousness Theory for Language Models: Anchoring Semantics, Thresholds of Activation, and Emergent Reasoning
by: Chang, Edward Y., et al.
Published: (2025)
Learning Temporal Abstractions via Variational Homomorphisms in Option-Induced Abstract MDPs
by: Li, Chang, et al.
Published: (2025)
SOCIA-$\nabla$: Textual Gradient Meets Multi-Agent Orchestration for Automated Simulator Generation
by: Hua, Yuncheng, et al.
Published: (2025)
A Library of LLM Intrinsics for Retrieval-Augmented Generation
by: Danilevsky, Marina, et al.
Published: (2025)
Evaluating LLM Metrics Through Real-World Capabilities
by: Miller, Justin K, et al.
Published: (2025)
A Fuzzy Logic Prompting Framework for Large Language Models in Adaptive and Uncertain Tasks
by: Figueiredo, Vanessa
Published: (2025)
Pharos-ESG: A Framework for Multimodal Parsing, Contextual Narration, and Hierarchical Labeling of ESG Report
by: Chen, Yan, et al.
Published: (2025)
PuzzleClone: A DSL-Powered Framework for Synthesizing Verifiable Data
by: Xiong, Kai, et al.
Published: (2025)
Argumentatively Coherent Judgmental Forecasting
by: Gorur, Deniz, et al.
Published: (2025)
SagaLLM: Context Management, Validation, and Transaction Guarantees for Multi-Agent LLM Planning
by: Chang, Edward Y., et al.
Published: (2025)
MapAgent: A Hierarchical Agent for Geospatial Reasoning with Dynamic Map Tool Integration
by: Hasan, Md Hasebul, et al.
Published: (2025)
Learning Efficient Guardrails for Compliance
by: Wen, Xiaofei, et al.
Published: (2025)
Unlocking the Wisdom of Large Language Models: An Introduction to The Path to Artificial General Intelligence
by: Chang, Edward Y.
Published: (2024)
Understanding LLM Evaluator Behavior: A Structured Multi-Evaluator Framework for Merchant Risk Assessment
by: Wang, Liang, et al.
Published: (2026)
RADD: Retrieval-Augmented Discrete Diffusion for Multi-Modal Knowledge Graph Completion
by: Niu, Guanglin, et al.
Published: (2026)
Quantifying Self-Preservation Bias in Large Language Models
by: Migliarini, Matteo, et al.
Published: (2026)