Library Catalog

Bibliographic Details
Main Authors:	Zhang, Ailing; Lei, Lina; Kong, Dehong; Wang, Zhixin; Xu, Jiaqi; Song, Fenglong; Guo, Chun-Le; Liu, Chang; Li, Fan; Chen, Jie
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2509.24427

Similar Items

YOSE: You Only Select Essential Tokens for Efficient DiT-based Video Object Removal
by: Wu, Chenyang, et al.
Published: (2026)

Dual Prompting Image Restoration with Diffusion Transformers
by: Kong, Dehong, et al.
Published: (2025)

HP-Edit: A Human-Preference Post-Training Framework for Image Editing
by: Li, Fan, et al.
Published: (2026)

UmniBench: Unified Understand and Generation Model Oriented Omni-dimensional Benchmark
by: Liu, Kai, et al.
Published: (2025)

SIV-Bench: A Video Benchmark for Social Interaction Understanding and Reasoning
by: Kong, Fanqi, et al.
Published: (2025)

UI-UG: A Unified MLLM for UI Understanding and Generation
by: Yang, Hao, et al.
Published: (2025)

1D-Bench: A Benchmark for Iterative UI Code Generation with Visual Feedback in Real-World
by: Xu, Qiao, et al.
Published: (2026)

ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems
by: Xue, Xiangyuan, et al.
Published: (2024)

T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation
by: Sun, Kaiyue, et al.
Published: (2024)

UI2Code^N: UI-to-Code Generation as Interactive Visual Optimization
by: Yang, Zhen, et al.
Published: (2025)

Macaron-A2UI: A Model for Generative UI in Personal Agents
by: Kong, Fancy, et al.
Published: (2026)

JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images
by: Wang, Zhecan, et al.
Published: (2024)

PocketSR: The Super-Resolution Expert in Your Pocket Mobiles
by: Sun, Haoze, et al.
Published: (2025)

RWKV-UI: UI Understanding with Enhanced Perception and Reasoning
by: Yang, Jiaxi, et al.
Published: (2025)

PlanViz: Evaluating Planning-Oriented Image Generation and Editing for Computer-Use Tasks
by: Li, Junxian, et al.
Published: (2026)

Urban Waterlogging Detection: A Challenging Benchmark and Large-Small Model Co-Adapter
by: Song, Suqi, et al.
Published: (2024)

MAIC-UI: Making Interactive Courseware with Generative UI
by: Tu, Shangqing, et al.
Published: (2026)

UI-Bench: A Benchmark for Evaluating Design Capabilities of AI Text-to-App Tools
by: Jung, Sam, et al.
Published: (2025)

FEDMEKI: A Benchmark for Scaling Medical Foundation Models via Federated Knowledge Injection
by: Wang, Jiaqi, et al.
Published: (2024)

CrowdGenUI: Aligning LLM-Based UI Generation with Crowdsourced User Preferences
by: Liu, Yimeng, et al.
Published: (2024)

LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation
by: Li, Jiacheng, et al.
Published: (2024)

Do-Undo Bench: Reversibility for Action Understanding in Image Generation
by: Mahajan, Shweta, et al.
Published: (2025)

Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
by: You, Keen, et al.
Published: (2024)

WebVIA: A Web-based Vision-Language Agentic Framework for Interactive and Verifiable UI-to-Code Generation
by: Xu, Mingde, et al.
Published: (2025)

Electrospun Multiscale Structured Nanofibers for Lithium‐Based Batteries
by: Kong, Dehong, et al.
Published: (2024)

ChartBench: A Benchmark for Complex Visual Reasoning in Charts
by: Xu, Zhengzhuo, et al.
Published: (2023)

MUIAnno: An Expert-Annotated Dataset and Evaluation Benchmark for Mobile UI Understanding
by: Parvez, Athar, et al.
Published: (2026)

Generative UI: LLMs are Effective UI Generators
by: Leviathan, Yaniv, et al.
Published: (2026)

MedHEval: Benchmarking Hallucinations and Mitigation Strategies in Medical Large Vision-Language Models
by: Chang, Aofei, et al.
Published: (2025)

Beyond Screenshots: Evaluating VLMs' Understanding of UI Animations
by: Liang, Chen, et al.
Published: (2026)

Electrochemically modulated single‐molecule localization microscopy for in vitro imaging cytoskeletal protein structures
by: Lei, Chenghong, et al.
Published: (2025)

Rethinking Entropy Allocation in LLM-based ASR: Understanding the Dynamics between Speech Encoders and LLMs
by: Xie, Yuan, et al.
Published: (2026)

II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models
by: Liu, Ziqiang, et al.
Published: (2024)

PodAgent: A Comprehensive Framework for Podcast Generation
by: Xiao, Yujia, et al.
Published: (2025)

UniREditBench: A Unified Reasoning-based Image Editing Benchmark
by: Han, Feng, et al.
Published: (2025)

CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation
by: Wang, Xinran, et al.
Published: (2025)

Environmental Matching Attack Against Unmanned Aerial Vehicles Object Detection
by: Kong, Dehong, et al.
Published: (2024)

Video-Bench: Human-Aligned Video Generation Benchmark
by: Han, Hui, et al.
Published: (2025)

PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs
by: Zhang, Zixin, et al.
Published: (2025)

Falcon-UI: Understanding GUI Before Following User Instructions
by: Shen, Huawen, et al.
Published: (2024)