| Main Authors: | Li, Chuntao, Qi, Ruihua, Tang, Chuan, Wei, Jiafu, Yang, Xi, Zhang, Qian, Zhou, Rixin |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2401.01002 |
Similar Items
ShiftedBronzes: Benchmarking and Analysis of Domain Fine-Grained Classification in Open-World Settings
by: Zhou, Rixin, et al.
Published: (2024)
LadderMoE: Ladder-Side Mixture of Experts Adapters for Bronze Inscription Recognition
by: Zhou, Rixin, et al.
Published: (2025)
PairingNet: A Learning-based Pair-searching and -matching Network for Image Fragments
by: Zhou, Rixin, et al.
Published: (2023)
Specializing Large Models for Oracle Bone Script Interpretation via Component-Grounded Multimodal Knowledge Augmentation
by: Zhang, Jianing, et al.
Published: (2026)
LineArt: A Knowledge-guided Training-free High-quality Appearance Transfer for Design Drawing with Diffusion Model
by: Wang, Xi, et al.
Published: (2024)
FilterPrompt: A Simple yet Efficient Approach to Guide Image Appearance Transfer in Diffusion Models
by: Wang, Xi, et al.
Published: (2024)
Clustering-based Feature Representation Learning for Oracle Bone Inscriptions Detection
by: Tao, Ye, et al.
Published: (2025)
CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications
by: Zhang, Tianfang, et al.
Published: (2024)
HuM-Eval: A Coarse-to-Fine Framework for Human-Centric Video Evaluation
by: Zhang, Bingzi, et al.
Published: (2026)
A Framework Combining 3D CNN and Transformer for Video-Based Behavior Recognition
by: Zhang, Xiuliang, et al.
Published: (2025)
CoDS: Enhancing Collaborative Perception in Heterogeneous Scenarios via Domain Separation
by: Han, Yushan, et al.
Published: (2025)
ElasticDiT: Efficient Diffusion Transformers via Elastic Architecture and Sparse Attention for High-Resolution Image Generation on Mobile Devices
by: Du, Kunpeng, et al.
Published: (2026)
MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots
by: Huang, Ting, et al.
Published: (2025)
When Trackers Date Fish: A Benchmark and Framework for Underwater Multiple Fish Tracking
by: Li, Weiran, et al.
Published: (2025)
MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices
by: Chu, Xiangxiang, et al.
Published: (2023)
Medical Referring Image Segmentation via Next-Token Mask Prediction
by: Chen, Xinyu, et al.
Published: (2025)
Two-in-One: Unified Multi-Person Interactive Motion Generation by Latent Diffusion Transformer
by: Li, Boyuan, et al.
Published: (2024)
IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning
by: Zhang, Quan, et al.
Published: (2025)
Rethinking Pseudo-Label Guided Learning for Weakly Supervised Temporal Action Localization from the Perspective of Noise Correction
by: Zhang, Quan, et al.
Published: (2025)
Ancient Script Image Recognition and Processing: A Review
by: Diao, Xiaolei, et al.
Published: (2025)
PVG: Progressive Vision Graph for Vision Recognition
by: Wu, Jiafu, et al.
Published: (2023)
Photo Dating by Facial Age Aggregation
by: Paplham, Jakub, et al.
Published: (2025)
MambaMIM: Pre-training Mamba with State Space Token Interpolation and its Application to Medical Image Segmentation
by: Tang, Fenghe, et al.
Published: (2024)
Geometric Pooling: maintaining more useful information
by: Xu, Hao, et al.
Published: (2023)
A Semantic-Enhanced Heterogeneous Graph Learning Method for Flexible Objects Recognition
by: Yang, Kunshan, et al.
Published: (2025)
ArcAid: Analysis of Archaeological Artifacts using Drawings
by: Hayon, Offry, et al.
Published: (2022)
PiT: Progressive Diffusion Transformer
by: Wu, Jiafu, et al.
Published: (2025)
AttentionDrag: Exploiting Latent Correlation Knowledge in Pre-trained Diffusion Models for Image Editing
by: Yang, Biao, et al.
Published: (2025)
M3-AGIQA: Multimodal, Multi-Round, Multi-Aspect AI-Generated Image Quality Assessment
by: Cui, Chuan, et al.
Published: (2025)
Rebalanced Vision-Language Retrieval Considering Structure-Aware Distillation
by: Yang, Yang, et al.
Published: (2024)
MARRS: Masked Autoregressive Unit-based Reaction Synthesis
by: Wang, Yabiao, et al.
Published: (2025)
MobileI2V: Fast and High-Resolution Image-to-Video on Mobile Devices
by: Zhang, Shuai, et al.
Published: (2025)
Self-Supervised Large Scale Point Cloud Completion for Archaeological Site Restoration
by: Li, Aocheng, et al.
Published: (2025)
X-OmniClaw Technical Report: A Unified Mobile Agent for Multimodal Understanding and Interaction
by: Ren, Xiaoming, et al.
Published: (2026)
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
by: Guan, Kaisi, et al.
Published: (2025)
Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration
by: Zhou, Ziheng, et al.
Published: (2024)
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
by: Wang, Zhenhailong, et al.
Published: (2025)
Diachronic Stereo Matching for Multi-Date Satellite Imagery
by: Masquil, Elías, et al.
Published: (2026)
MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation
by: Mao, Xiaofeng, et al.
Published: (2024)
ReTracing: An Archaeological Approach Through Body, Machine, and Generative Systems
by: Wang, Yitong, et al.
Published: (2026)