Library Catalog

Bibliographic Details
Main Authors:	Yan, Jiebin; Wu, Lei; Fang, Yuming; Liu, Xuelin; Xia, Xue; Liu, Weide
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2501.07087

Similar Items

Subjective and Objective Quality Assessment of Non-Uniformly Distorted Omnidirectional Images
by: Yan, Jiebin, et al.
Published: (2025)

Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Unified and Generalized Approach
by: Yan, Jiebin, et al.
Published: (2026)

2AFC Prompting of Large Multimodal Models for Image Quality Assessment
by: Zhu, Hanwei, et al.
Published: (2024)

Multitask Auxiliary Network for Perceptual Quality Assessment of Non-Uniformly Distorted Omnidirectional Images
by: Yan, Jiebin, et al.
Published: (2025)

Causal Disentanglement-Inspired Degradation Representation Learning for Full-Reference Image Quality Assessment
by: Zhang, Zhen, et al.
Published: (2026)

Diffusion-based Facial Aesthetics Enhancement with 3D Structure Guidance
by: Li, Lisha, et al.
Published: (2025)

PanFoMa: A Lightweight Foundation Model and Benchmark for Pan-Cancer
by: Huang, Xiaoshui, et al.
Published: (2025)

Computational Analysis of Degradation Modeling in Blind Panoramic Image Quality Assessment
by: Yan, Jiebin, et al.
Published: (2025)

Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Flexible and Effective Paradigm
by: Yan, Jiebin, et al.
Published: (2025)

Uncertainty Awareness on Unsupervised Domain Adaptation for Time Series Data
by: Liu, Weide, et al.
Published: (2025)

Max360IQ: Blind Omnidirectional Image Quality Assessment with Multi-axis Attention
by: Yan, Jiebin, et al.
Published: (2025)

Scaling-up Perceptual Video Quality Assessment
by: Jia, Ziheng, et al.
Published: (2025)

Draft-and-Target Sampling for Video Generation Policy
by: Zhang, Qikang, et al.
Published: (2026)

SweetTok: Semantic-Aware Spatial-Temporal Tokenizer for Compact Video Discretization
by: Tan, Zhentao, et al.
Published: (2024)

Rethinking Weakly-supervised Video Temporal Grounding From a Game Perspective
by: Fang, Xiang, et al.
Published: (2026)

VQA$^2$: Visual Question Answering for Video Quality Assessment
by: Jia, Ziheng, et al.
Published: (2024)

Democratizing High-Fidelity Co-Speech Gesture Video Generation
by: Yang, Xu, et al.
Published: (2025)

One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution
by: Sun, Yujing, et al.
Published: (2025)

Moment-Video: Diagnosing Temporal Fidelity of Video MLLMs on Momentary Visual Events
by: Liu, Xiaolin, et al.
Published: (2026)

STAA: Spatio-Temporal Attention Attribution for Real-Time Interpreting Transformer-based Video Models
by: Wang, Zerui, et al.
Published: (2024)

A Paradigm Shift: Fully End-to-End Training for Temporal Sentence Grounding in Videos
by: He, Allen, et al.
Published: (2026)

Fast Adversarial Training with Weak-to-Strong Spatial-Temporal Consistency in the Frequency Domain on Videos
by: Wang, Songping, et al.
Published: (2025)

Spatia: Video Generation with Updatable Spatial Memory
by: Zhao, Jinjing, et al.
Published: (2025)

FastInit: Fast Noise Initialization for Temporally Consistent Video Generation
by: Bai, Chengyu, et al.
Published: (2025)

VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding
by: Wang, Shihao, et al.
Published: (2025)

PathoSyn: Imaging-Pathology MRI Synthesis via Disentangled Deviation Diffusion
by: Wang, Jian, et al.
Published: (2025)

SurgLLM: A Versatile Large Multimodal Model with Spatial Focus and Temporal Awareness for Surgical Video Understanding
by: Chen, Zhen, et al.
Published: (2025)

The Devil is in Temporal Token: High Quality Video Reasoning Segmentation
by: Gong, Sitong, et al.
Published: (2025)

Online Handwritten Signature Verification Based on Temporal-Spatial Graph Attention Transformer
by: Yuan, Hai-jie, et al.
Published: (2025)

Meta-Point Learning and Refining for Category-Agnostic Pose Estimation
by: Chen, Junjie, et al.
Published: (2024)

TimeScope: Towards Task-Oriented Temporal Grounding In Long Videos
by: Liu, Xiangrui, et al.
Published: (2025)

Mask-RadarNet: Enhancing Transformer With Spatial-Temporal Semantic Context for Radar Object Detection in Autonomous Driving
by: Wu, Yuzhi, et al.
Published: (2024)

Learning from Online Videos at Inference Time for Computer-Use Agents
by: Liu, Yujian, et al.
Published: (2025)

Learning Local and Global Temporal Contexts for Video Semantic Segmentation
by: Sun, Guolei, et al.
Published: (2022)

SpatialPoint: Spatial-aware Point Prediction for Embodied Localization
by: Zhu, Qiming, et al.
Published: (2026)

StarVid: Enhancing Semantic Alignment in Video Diffusion Models via Spatial and SynTactic Guided Attention Refocusing
by: Li, Yuanhang, et al.
Published: (2024)

ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation
by: Zhang, Mengchen, et al.
Published: (2025)

DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement
by: Wu, Hao, et al.
Published: (2024)

RelightVid: Temporal-Consistent Diffusion Model for Video Relighting
by: Fang, Ye, et al.
Published: (2025)

LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models
by: Ge, Qihang, et al.
Published: (2024)