| Main Authors: | Zhang, Ailing; Lei, Lina; Kong, Dehong; Wang, Zhixin; Xu, Jiaqi; Song, Fenglong; Guo, Chun-Le; Liu, Chang; Li, Fan; Chen, Jie |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.24427 |
Similar Items
YOSE: You Only Select Essential Tokens for Efficient DiT-based Video Object Removal
by: Wu, Chenyang, et al.
Published: (2026)
Dual Prompting Image Restoration with Diffusion Transformers
by: Kong, Dehong, et al.
Published: (2025)
HP-Edit: A Human-Preference Post-Training Framework for Image Editing
by: Li, Fan, et al.
Published: (2026)
UmniBench: Unified Understand and Generation Model Oriented Omni-dimensional Benchmark
by: Liu, Kai, et al.
Published: (2025)
SIV-Bench: A Video Benchmark for Social Interaction Understanding and Reasoning
by: Kong, Fanqi, et al.
Published: (2025)
UI-UG: A Unified MLLM for UI Understanding and Generation
by: Yang, Hao, et al.
Published: (2025)
1D-Bench: A Benchmark for Iterative UI Code Generation with Visual Feedback in Real-World
by: Xu, Qiao, et al.
Published: (2026)
ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems
by: Xue, Xiangyuan, et al.
Published: (2024)
T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation
by: Sun, Kaiyue, et al.
Published: (2024)
UI2Code^N: UI-to-Code Generation as Interactive Visual Optimization
by: Yang, Zhen, et al.
Published: (2025)
Macaron-A2UI: A Model for Generative UI in Personal Agents
by: Kong, Fancy, et al.
Published: (2026)
JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images
by: Wang, Zhecan, et al.
Published: (2024)
PocketSR: The Super-Resolution Expert in Your Pocket Mobiles
by: Sun, Haoze, et al.
Published: (2025)
RWKV-UI: UI Understanding with Enhanced Perception and Reasoning
by: Yang, Jiaxi, et al.
Published: (2025)
PlanViz: Evaluating Planning-Oriented Image Generation and Editing for Computer-Use Tasks
by: Li, Junxian, et al.
Published: (2026)
Urban Waterlogging Detection: A Challenging Benchmark and Large-Small Model Co-Adapter
by: Song, Suqi, et al.
Published: (2024)
MAIC-UI: Making Interactive Courseware with Generative UI
by: Tu, Shangqing, et al.
Published: (2026)
UI-Bench: A Benchmark for Evaluating Design Capabilities of AI Text-to-App Tools
by: Jung, Sam, et al.
Published: (2025)
FEDMEKI: A Benchmark for Scaling Medical Foundation Models via Federated Knowledge Injection
by: Wang, Jiaqi, et al.
Published: (2024)
CrowdGenUI: Aligning LLM-Based UI Generation with Crowdsourced User Preferences
by: Liu, Yimeng, et al.
Published: (2024)
LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation
by: Li, Jiacheng, et al.
Published: (2024)
Do-Undo Bench: Reversibility for Action Understanding in Image Generation
by: Mahajan, Shweta, et al.
Published: (2025)
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
by: You, Keen, et al.
Published: (2024)
WebVIA: A Web-based Vision-Language Agentic Framework for Interactive and Verifiable UI-to-Code Generation
by: Xu, Mingde, et al.
Published: (2025)
Electrospun Multiscale Structured Nanofibers for Lithium‐Based Batteries
by: Dehong Kong, et al.
Published: (2024)
ChartBench: A Benchmark for Complex Visual Reasoning in Charts
by: Xu, Zhengzhuo, et al.
Published: (2023)
MUIAnno: An Expert-Annotated Dataset and Evaluation Benchmark for Mobile UI Understanding
by: Parvez, Athar, et al.
Published: (2026)
Generative UI: LLMs are Effective UI Generators
by: Leviathan, Yaniv, et al.
Published: (2026)
MedHEval: Benchmarking Hallucinations and Mitigation Strategies in Medical Large Vision-Language Models
by: Chang, Aofei, et al.
Published: (2025)
Beyond Screenshots: Evaluating VLMs' Understanding of UI Animations
by: Liang, Chen, et al.
Published: (2026)
Electrochemically modulated single‐molecule localization microscopy for in vitro imaging cytoskeletal protein structures
by: Chenghong Lei, et al.
Published: (2025)
Rethinking Entropy Allocation in LLM-based ASR: Understanding the Dynamics between Speech Encoders and LLMs
by: Xie, Yuan, et al.
Published: (2026)
II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models
by: Liu, Ziqiang, et al.
Published: (2024)
PodAgent: A Comprehensive Framework for Podcast Generation
by: Xiao, Yujia, et al.
Published: (2025)
UniREditBench: A Unified Reasoning-based Image Editing Benchmark
by: Han, Feng, et al.
Published: (2025)
CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation
by: Wang, Xinran, et al.
Published: (2025)
Environmental Matching Attack Against Unmanned Aerial Vehicles Object Detection
by: Kong, Dehong, et al.
Published: (2024)
Video-Bench: Human-Aligned Video Generation Benchmark
by: Han, Hui, et al.
Published: (2025)
PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs
by: Zhang, Zixin, et al.
Published: (2025)
Falcon-UI: Understanding GUI Before Following User Instructions
by: Shen, Huawen, et al.
Published: (2024)