:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Chaoyun, Shen, I-Chao, Igarashi, Takeo, Jiang, Caigui
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2507.15000

Similar Items

Cascaded Robust Rectification for Arbitrary Document Images
by: Wang, Chaoyun, et al.
Published: (2025)

Real-Time Per-Garment Virtual Try-On with Temporal Consistency for Loose-Fitting Garments
by: Wu, Zaiqiang, et al.
Published: (2025)

D2Dewarp: Dual Dimensions Geometric Representation Learning Based Document Image Dewarping
by: Li, Heng, et al.
Published: (2025)

TADoc: Robust Time-Aware Document Image Dewarping
by: Zhao, Fangmin, et al.
Published: (2025)

Garment Particles: A 2D–3D Symmetric Garment Representation for Generation and Editing
by: Nakayama, Kiyohiro, et al.
Published: (2026)

DvD: Unleashing a Generative Paradigm for Document Dewarping via Coordinates-based Diffusion Model
by: Zhang, Weiguang, et al.
Published: (2025)

FontCLIP: A Semantic Typography Visual-Language Model for Multilingual Font Applications
by: Tatsukawa, Yuki, et al.
Published: (2024)

PAS3R: Pose-Adaptive Streaming 3D Reconstruction for Long Video Sequences
by: Xu, Lanbo, et al.
Published: (2026)

Dropping Anchor and Spherical Harmonics for Sparse-view Gaussian Splatting
by: Fang, Shuangkang, et al.
Published: (2026)

MicroGlam: Microscopic Skin Image Dataset with Cosmetics
by: Chong, Toby, et al.
Published: (2023)

NeRF Is a Valuable Assistant for 3D Gaussian Splatting
by: Fang, Shuangkang, et al.
Published: (2025)

Low-Barrier Dataset Collection with Real Human Body for Interactive Per-Garment Virtual Try-On
by: Wu, Zaiqiang, et al.
Published: (2025)

A Compact Hybrid Convolution–Frequency State Space Network for Learned Image Compression
by: Pan, Haodong, et al.
Published: (2025)

Efficient Document Image Dewarping via Hybrid Deep Learning and Cubic Polynomial Geometry Restoration
by: Istomin, Valery, et al.
Published: (2025)

MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh
by: Fang, Shuangkang, et al.
Published: (2025)

iComMa: Inverting 3D Gaussian Splatting for Camera Pose Estimation via Comparing and Matching
by: Sun, Yuan, et al.
Published: (2023)

FunHOI: Annotation-Free 3D Hand-Object Interaction Generation via Functional Text Guidance
by: Tian, Yongqi, et al.
Published: (2025)

Automatic Dance Video Segmentation for Understanding Choreography
by: Endo, Koki, et al.
Published: (2024)

Axis-Aligned 3D Stalk Diameter Estimation from RGB-D Imagery
by: Vail, Benjamin, et al.
Published: (2025)

Lung Nodule Image Synthesis Driven by Two-Stage Generative Adversarial Networks
by: Cao, Lu, et al.
Published: (2026)

AlignHuman: Improving Motion and Fidelity via Timestep-Segment Preference Optimization for Audio-Driven Human Animation
by: Liang, Chao, et al.
Published: (2025)

U-REPA: Aligning Diffusion U-Nets to ViTs
by: Tian, Yuchuan, et al.
Published: (2025)

HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution
by: Chu, Shu-Chuan, et al.
Published: (2024)

Wasserstein-Aligned Hyperbolic Multi-View Clustering
by: Wang, Rui, et al.
Published: (2025)

PPJudge: Towards Human-Aligned Assessment of Artistic Painting Process
by: Jiang, Shiqi, et al.
Published: (2025)

BATON: A Multimodal Benchmark for Bidirectional Automation Transition Observation in Naturalistic Driving
by: Wang, Yuhang, et al.
Published: (2026)

Orientation Matters: Making 3D Generative Models Orientation-Aligned
by: Lu, Yichong, et al.
Published: (2025)

AlignedGen: Aligning Style Across Generated Images
by: Zhang, Jiexuan, et al.
Published: (2025)

Nodule-Aligned Latent Space Learning with LLM-Driven Multimodal Diffusion for Lung Nodule Progression Prediction
by: Song, James, et al.
Published: (2026)

H-OmniStereo: Zero-Shot Omnidirectional Stereo Matching with Heading-Aligned Normal Priors
by: Jiang, Chenxing, et al.
Published: (2026)

Natural Human Motion Recovery by Aligning High-Order Temporal Dynamics from Monocular Videos
by: Wei, Dingkun, et al.
Published: (2026)

Follow-Your-Preference: Towards Preference-Aligned Image Inpainting
by: Shen, Yutao, et al.
Published: (2025)

Omni-Judge: Can Omni-LLMs Serve as Human-Aligned Judges for Text-Conditioned Audio-Video Generation?
by: Liang, Susan, et al.
Published: (2026)

Align-DETR: Enhancing End-to-end Object Detection with Aligned Loss
by: Cai, Zhi, et al.
Published: (2023)

Facial beauty prediction fusing transfer learning and broad learning system
by: Gan, Junying, et al.
Published: (2026)

DocCogito: Aligning Layout Cognition and Step-Level Grounded Reasoning for Document Understanding
by: Wu, Yuchuan, et al.
Published: (2026)

PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter
by: Xiao, Junfei, et al.
Published: (2024)

MTGA: Multi-View Temporal Granularity Aligned Aggregation for Event-Based Lip-Reading
by: Zhang, Wenhao, et al.
Published: (2024)

Geo-Align: Video Generation Alignment via Metric Geometry Reward
by: Li, Zizun, et al.
Published: (2026)

Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
by: Lu, Jiahao, et al.
Published: (2024)