| Main Authors: | Song, Juan, Yang, Lijie, Feng, Mingtao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.00399 |
Similar Items
High Frequency Matters: Uncertainty Guided Image Compression with Wavelet Diffusion
by: Song, Juan, et al.
Published: (2024)
Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs
by: Liu, Jinming, et al.
Published: (2024)
LMM-driven Semantic Image-Text Coding for Ultra Low-bitrate Learned Image Compression
by: Murai, Shimon, et al.
Published: (2024)
Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer
by: Xue, Naifu, et al.
Published: (2024)
FLaTEC: Frequency-Disentangled Latent Triplanes for Efficient Compression of LiDAR Point Clouds
by: Zhang, Xiaoge, et al.
Published: (2025)
SLIM: Semantic-based Low-bitrate Image compression for Machines by leveraging diffusion
by: Lee, Hyeonjin, et al.
Published: (2025)
Towards image compression with perfect realism at ultra-low bitrates
by: Careil, Marlène, et al.
Published: (2023)
Fine color guidance in diffusion models and its application to image compression at extremely low bitrates
by: Bordin, Tom, et al.
Published: (2024)
Perception Without Engagement: Dissecting the Causal Discovery Deficit in LMMs
by: Liang, Jiafeng, et al.
Published: (2026)
Teaching LMMs for Image Quality Scoring and Interpreting
by: Zhang, Zicheng, et al.
Published: (2025)
All-in-One Transferring Image Compression from Human Perception to Multi-Machine Perception
by: Zhao, Jiancheng, et al.
Published: (2025)
VisualCritic: Making LMMs Perceive Visual Quality Like Humans
by: Huang, Zhipeng, et al.
Published: (2024)
LMME3DHF: Benchmarking and Evaluating Multimodal 3D Human Face Generation with LMMs
by: Yang, Woo Yi, et al.
Published: (2025)
Human vs. LMMs: Exploring the Discrepancy in Emoji Interpretation and Usage in Digital Communication
by: Lyu, Hanjia, et al.
Published: (2024)
LMM4Edit: Benchmarking and Evaluating Multimodal Image Editing with LMMs
by: Xu, Zitong, et al.
Published: (2025)
Language-Guided Visual Perception Disentanglement for Image Quality Assessment and Conditional Image Generation
by: Yang, Zhichao, et al.
Published: (2025)
MMGenBench: Fully Automatically Evaluating LMMs from the Text-to-Image Generation Perspective
by: Huang, Hailang, et al.
Published: (2024)
LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs
by: Wang, Jiarui, et al.
Published: (2025)
Distributed Image Compression with Multimodal Side Information at Extremely Low Bitrates
by: Xu, Guojun, et al.
Published: (2026)
HiSem: Hierarchical Semantic Disentangling for Remote Sensing Image Change Captioning
by: Wang, Man, et al.
Published: (2026)
MIBench: Evaluating LMMs on Multimodal Interaction
by: Miao, Yu, et al.
Published: (2026)
A Framework for Generating Semantically Ambiguous Images to Probe Human and Machine Perception
by: Hu, Yuqi, et al.
Published: (2026)
ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs
by: Xie, Yin, et al.
Published: (2024)
A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
by: Zhang, Zicheng, et al.
Published: (2024)
Learning to Wander: Improving the Global Image Geolocation Ability of LMMs via Actionable Reasoning
by: Zheng, Yushuo, et al.
Published: (2026)
Hierarchical Semantic Compression for Consistent Image Semantic Restoration
by: Li, Shengxi, et al.
Published: (2025)
Efficiently Disentangling CLIP for Multi-Object Perception
by: Rawlekar, Samyak, et al.
Published: (2025)
Disentangled Human Body Representation Based on Unsupervised Semantic-Aware Learning
by: Wang, Lu, et al.
Published: (2025)
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation
by: Li, Hongxiang, et al.
Published: (2024)
MMSearch-R1: Incentivizing LMMs to Search
by: Wu, Jinming, et al.
Published: (2025)
Visually-Guided Controllable Medical Image Generation via Fine-Grained Semantic Disentanglement
by: Huang, Xin, et al.
Published: (2026)
Disco4D: Disentangled 4D Human Generation and Animation from a Single Image
by: Pang, Hui En, et al.
Published: (2024)
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models
by: Zhang, Kaichen, et al.
Published: (2024)
Map-Assisted Remote-Sensing Image Compression at Extremely Low Bitrates
by: Ye, Yixuan, et al.
Published: (2024)
DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs
by: Meng, Lingchen, et al.
Published: (2024)
Infrared and Visible Image Fusion with Hierarchical Human Perception
by: Yang, Guang, et al.
Published: (2024)
Noise Dimension of GAN: An Image Compression Perspective
by: Zhu, Ziran, et al.
Published: (2024)
UniCoRN: Unified Commented Retrieval Network with LMMs
by: Jaritz, Maximilian, et al.
Published: (2025)
Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs
by: Zhang, Zicheng, et al.
Published: (2024)
VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs
by: Bharadwaj, Rohit, et al.
Published: (2024)