| Main Authors: | Liu, Ruibo, Wei, Jerry, Liu, Fangyu, Si, Chenglei, Zhang, Yanzhe, Rao, Jinmeng, Zheng, Steven, Peng, Daiyi, Yang, Diyi, Zhou, Denny, Dai, Andrew M. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2404.07503 |
Similar Items
Design2Code: Benchmarking Multimodal Code Generation for Automated Front-End Engineering
by: Si, Chenglei, et al.
Published: (2024)
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
by: Si, Chenglei, et al.
Published: (2024)
The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas
by: Si, Chenglei, et al.
Published: (2025)
Higher Layers Need More LoRA Experts
by: Gao, Chongyang, et al.
Published: (2024)
Searching for Privacy Risks in LLM Agents via Simulation
by: Zhang, Yanzhe, et al.
Published: (2025)
Scaling Tumor Segmentation: Best Lessons from Real and Synthetic Data
by: Chen, Qi, et al.
Published: (2025)
A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration
by: Liu, Zijun, et al.
Published: (2023)
Knowing You Don't Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing
by: Yang, Diji, et al.
Published: (2025)
Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping
by: Li, Ryan, et al.
Published: (2024)
Attacking Vision-Language Computer Agents via Pop-ups
by: Zhang, Yanzhe, et al.
Published: (2024)
Towards Execution-Grounded Automated AI Research
by: Si, Chenglei, et al.
Published: (2026)
Auditing Gender Presentation Differences in Text-to-Image Models
by: Zhang, Yanzhe, et al.
Published: (2023)
Long-form factuality in large language models
by: Wei, Jerry, et al.
Published: (2024)
Distilling an End-to-End Voice Assistant Without Instruction Training Data
by: Held, William, et al.
Published: (2024)
Highly Efficient and Stable Narrow Band Green Emitting Phosphor of Sb³⁺/Ce³⁺ Sensitized Cs₂NaTbCl₆ for WLED
by: Changheng Chen, et al.
Published: (2024)
Contextual Experience Replay for Self-Improvement of Language Agents
by: Liu, Yitao, et al.
Published: (2025)
Generative Interfaces for Language Models
by: Chen, Jiaqi, et al.
Published: (2025)
Real-Time Reasoning Agents in Evolving Environments
by: Wen, Yule, et al.
Published: (2025)
Type-Compliant Adaptation Cascades: Adapting Programmatic LM Workflows to Data
by: Lin, Chu-Cheng, et al.
Published: (2025)
Borderline content and platformised speech governance: Mapping TikTok's moderation controversies in South and Southeast Asia
by: Diyi Liu
Published: (2024)
SPHERE: An Evaluation Card for Human-AI Systems
by: Ma, Qianou, et al.
Published: (2025)
Defect‐Engineered Zero‐Dimensional Perovskite Cs₃LuCl₆:Tb³⁺ Scintillator with Exceptional Thermal Stability for Flexible High‐Temperature X‐Ray Imaging
by: Ruibo Gao, et al.
Published: (2026)
Achieving Single‐Phased Full Visible Spectrum Broadband White Emission in Ag⁺, Bi³⁺, and Sb³⁺ Tri‐Doped Cs₂NaLuCl₆ Double Perovskite Phosphor
by: Changheng Chen, et al.
Published: (2025)
The Best Instruction-Tuning Data are Those That Fit
by: Zhang, Dylan, et al.
Published: (2025)
LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding
by: Zhang, Yanzhe, et al.
Published: (2023)
Contextualized Privacy Defense for LLM Agents
by: Wen, Yule, et al.
Published: (2026)
Security and Innovation in ERP Systems: Best Practices for AI, OIC, and Automation Integration
by: Sreenivasa Rao Sola
Published: (2023)
Simple synthetic data reduces sycophancy in large language models
by: Wei, Jerry, et al.
Published: (2023)
Challenges and Best Practices in Corporate AI Governance: Lessons from the Biopharmaceutical Industry
by: Mökander, Jakob, et al.
Published: (2024)
When to Showcase Automated Production Processes? Disclosing Production Processes Increases Evaluation of Low‐End but Decreases Evaluation of High‐End Products
by: Diyi Liu, et al.
Published: (2025)
Deploying Tiny LVLM Judges for Real-World Evaluation of Chart Models: Lessons Learned and Best Practices
by: Laskar, Md Tahmid Rahman, et al.
Published: (2025)
Selecting the Best Optimizing System
by: Si, Nian, et al.
Published: (2022)
Robust Output Regulation of Uncertain Linear Time-Varying Systems
by: Zha, Jinmeng, et al.
Published: (2026)
AutoMetrics: Approximate Human Judgements with Automatically Generated Evaluators
by: Ryan, Michael J., et al.
Published: (2025)
S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models
by: Lei, Fangyu, et al.
Published: (2023)
Evaluation on Aggregate Particle Spalling of Induction Heating‐Based Functional Ultra‐Thin Friction Layer Using Image Processing Based on MATLAB
by: Zhengmengyuan Rao, et al.
Published: (2026)
IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues
by: Yang, Diji, et al.
Published: (2024)
Tweedie Regression for Video Recommendation System
by: Zheng, Yan, et al.
Published: (2025)
Relic abundance of dark matter with coannihilation in non-standard cosmological scenarios
by: Liu, Fangyu, et al.
Published: (2023)
Constraints on Asymmetric Dark Matter Self Annihilation Cross Sections in Non-standard Cosmological Scenarios
by: Liu, Fangyu, et al.
Published: (2023)