:: Library Catalog

Bibliographic Details
Main Authors:	Zhao, Yang; Xu, Yanwu; Xiao, Zhisheng; Jia, Haolin; Hou, Tingbo
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2311.16567

Similar Items

MobileViCLIP: An Efficient Video-Text Model for Mobile Devices
by: Yang, Min, et al.
Published: (2025)

Mobile-VTON: High-Fidelity On-Device Virtual Try-On
by: Wan, Zhenchen, et al.
Published: (2026)

SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training
by: Hu, Dongting, et al.
Published: (2024)

Instant Preference Alignment for Text-to-Image Diffusion Models
by: Li, Yang, et al.
Published: (2025)

On-device Sora: Enabling Training-Free Diffusion-based Text-to-Video Generation for Mobile Devices
by: Kim, Bosung, et al.
Published: (2025)

Mobile-O: Unified Multimodal Understanding and Generation on Mobile Device
by: Shaker, Abdelrahman, et al.
Published: (2026)

ElasticDiT: Efficient Diffusion Transformers via Elastic Architecture and Sparse Attention for High-Resolution Image Generation on Mobile Devices
by: Du, Kunpeng, et al.
Published: (2026)

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
by: Wang, Haofan, et al.
Published: (2024)

InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation
by: Wang, Haofan, et al.
Published: (2024)

MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices
by: Chu, Xiangxiang, et al.
Published: (2023)

MobileI2V: Fast and High-Resolution Image-to-Video on Mobile Devices
by: Zhang, Shuai, et al.
Published: (2025)

NanoFLUX: Distillation-Driven Compression of Large Text-to-Image Generation Models for Mobile Devices
by: Chavhan, Ruchika, et al.
Published: (2026)

Instant3D: Instant Text-to-3D Generation
by: Li, Ming, et al.
Published: (2023)

Mobile-GS: Real-time Gaussian Splatting for Mobile Devices
by: Du, Xiaobiao, et al.
Published: (2026)

MobileIE: An Extremely Lightweight and Effective ConvNet for Real-Time Image Enhancement on Mobile Devices
by: Yan, Hailong, et al.
Published: (2025)

Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
by: Wang, Junyang, et al.
Published: (2024)

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
by: Wang, Junyang, et al.
Published: (2024)

AMC: AutoML for Model Compression and Acceleration on Mobile Devices
by: He, Yihui, et al.
Published: (2018)

StreamDiT: Real-Time Streaming Text-to-Video Generation
by: Kodaira, Akio, et al.
Published: (2025)

LightPure: Realtime Adversarial Image Purification for Mobile Devices Using Diffusion Models
by: Khalili, Hossein, et al.
Published: (2024)

SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
by: Wu, Yushu, et al.
Published: (2024)

Instant 3D Human Avatar Generation using Image Diffusion Models
by: Kolotouros, Nikos, et al.
Published: (2024)

Towards Lightest Low-Light Image Enhancement Architecture for Mobile Devices
by: Bai, Guangrui, et al.
Published: (2025)

Rethinking Diffusion-Based Image Generators for Fundus Fluorescein Angiography Synthesis on Limited Data
by: Yu, Chengzhou, et al.
Published: (2024)

InstantIR: Blind Image Restoration with Instant Generative Reference
by: Huang, Jen-Yuan, et al.
Published: (2024)

Neodragon: Mobile Video Generation using Diffusion Transformer
by: Karnewar, Animesh, et al.
Published: (2025)

FontAdapter: Instant Font Adaptation in Visual Text Generation
by: Koo, Myungkyu, et al.
Published: (2025)

MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices
by: Jiang, Jianwen, et al.
Published: (2024)

S2DiT: Sandwich Diffusion Transformer for Mobile Streaming Video Generation
by: Zhao, Lin, et al.
Published: (2026)

Mobile Augmented Reality Framework with Fusional Localization and Pose Estimation
by: Hou, Songlin, et al.
Published: (2025)

MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation
by: Li, Mingcheng, et al.
Published: (2025)

Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks
by: Zou, Siyu, et al.
Published: (2024)

Mobile Video Diffusion
by: Yahia, Haitam Ben, et al.
Published: (2024)

Now You See It, Now You Don't - Instant Concept Erasure for Safe Text-to-Image and Video Generation
by: Biswas, Shristi Das, et al.
Published: (2025)

Populate-A-Scene: Affordance-Aware Human Video Generation
by: Shan, Mengyi, et al.
Published: (2025)

InstantEdit: Text-Guided Few-Step Image Editing with Piecewise Rectified Flow
by: Gong, Yiming, et al.
Published: (2025)

Efficient Neural Light Fields (ENeLF) for Mobile Devices
by: Peng, Austin
Published: (2024)

Porting Large Language Models to Mobile Devices for Question Answering
by: Fassold, Hannes
Published: (2024)

Towards Real-time Video Compressive Sensing on Mobile Devices
by: Cao, Miao, et al.
Published: (2024)