| Main Author: | Hendria, Willy Fitra |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2306.11341 |
Similar Items
LiteGPT: Large Vision-Language Model for Joint Chest X-ray Localization and Classification Task
by: Le-Duc, Khai, et al.
Published: (2024)
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
by: Henschel, Roberto, et al.
Published: (2024)
CPLIP: Zero-Shot Learning for Histopathology with Comprehensive Vision-Language Alignment
by: Javed, Sajid, et al.
Published: (2024)
Deep Video Codec Control for Vision Models
by: Reich, Christoph, et al.
Published: (2023)
HPC: Hierarchical Progressive Coding Framework for Volumetric Video
by: Zheng, Zihan, et al.
Published: (2024)
TIACam: Text-Anchored Invariant Feature Learning with Auto-Augmentation for Camera-Robust Zero-Watermarking
by: Tanvir, Abdullah All, et al.
Published: (2026)
Comparing the Robustness of Modern No-Reference Image- and Video-Quality Metrics to Adversarial Attacks
by: Antsiferova, Anastasia, et al.
Published: (2023)
FineVQ: Fine-Grained User Generated Content Video Quality Assessment
by: Duan, Huiyu, et al.
Published: (2024)
Spatial Visibility and Temporal Dynamics: Revolutionizing Field of View Prediction in Adaptive Point Cloud Video Streaming
by: Li, Chen, et al.
Published: (2024)
Video Quality Enhancement Using Deep Learning-Based Prediction Models for Quantized DCT Coefficients in MPEG I-frames
by: Busson, Antonio J G, et al.
Published: (2020)
HiLight: Technical Report on the Motern AI Video Language Model
by: Wang, Zhiting, et al.
Published: (2024)
A Survey on Super Resolution for video Enhancement Using GAN
by: Maity, Ankush, et al.
Published: (2023)
A Near-Raw Talking-Head Video Dataset for Various Computer Vision Tasks
by: Naderi, Babak, et al.
Published: (2026)
Self-Supervised Compression and Artifact Correction for Streaming Underwater Imaging Sonar
by: Qian, Rongsheng, et al.
Published: (2025)
SegCompass: Exploring Interpretable Alignment with Sparse Autoencoders for Enhanced Reasoning Segmentation
by: Lu, Zhenyu, et al.
Published: (2026)
NAIMA: Semantics Aware RGB Guided Depth Super-Resolution
by: Nasir, Tayyab, et al.
Published: (2026)
Medical Image Analysis for Detection, Treatment and Planning of Disease using Artificial Intelligence Approaches
by: Yadav, Nand Lal, et al.
Published: (2024)
SCENE: Semantic-aware Codec Enhancement with Neural Embeddings
by: Lin, Han-Yu, et al.
Published: (2026)
Attention GhostUNet++: Enhanced Segmentation of Adipose Tissue and Liver in CT Images
by: Hayat, Mansoor, et al.
Published: (2025)
DeepFaceLab: Integrated, flexible and extensible face-swapping framework
by: Perov, Ivan, et al.
Published: (2020)
CFAT: Unleashing TriangularWindows for Image Super-resolution
by: Ray, Abhisek, et al.
Published: (2024)
Panoramic Image Inpainting With Gated Convolution And Contextual Reconstruction Loss
by: Yu, Li, et al.
Published: (2024)
Benchmarking Conventional and Learned Video Codecs with a Low-Delay Configuration
by: Teng, Siyue, et al.
Published: (2024)
Optimizing Multimodal LLMs for Egocentric Video Understanding: A Solution for the HD-EPIC VQA Challenge
by: Yang, Sicheng, et al.
Published: (2026)
LinMU: Multimodal Understanding Made Linear
by: Wang, Hongjie, et al.
Published: (2026)
CAMP-VQA: Caption-Embedded Multimodal Perception for No-Reference Quality Assessment of Compressed Video
by: Wang, Xinyi, et al.
Published: (2025)
Semantic-Aware Adaptive Video Streaming Using Latent Diffusion Models for Wireless Networks
by: Yan, Zijiang, et al.
Published: (2025)
qAttCNN - Self Attention Mechanism for Video QoE Prediction in Encrypted Traffic
by: Sidorov, Michael, et al.
Published: (2026)
Frequency-Spatial Interaction Driven Network for Low-Light Image Enhancement
by: Tao, Yunhong, et al.
Published: (2025)
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios
by: Peng, Kunyu, et al.
Published: (2025)
Perceptual Video Quality Assessment: A Survey
by: Min, Xiongkuo, et al.
Published: (2024)
ICME 2025 Grand Challenge on Video Super-Resolution for Video Conferencing
by: Naderi, Babak, et al.
Published: (2025)
Analysis of Video Quality Datasets via Design of Minimalistic Video Quality Models
by: Sun, Wei, et al.
Published: (2023)
T2IW: Joint Text to Image & Watermark Generation
by: Liu, An-An, et al.
Published: (2023)
Learning Perceptual Representations for Gaming NR-VQA with Multi-Task FR Signals
by: Chen, Yu-Chih, et al.
Published: (2026)
Safe-VAR: Safe Visual Autoregressive Model for Text-to-Image Generative Watermarking
by: Wang, Ziyi, et al.
Published: (2025)
R-Bench: Are your Large Multimodal Model Robust to Real-world Corruptions?
by: Li, Chunyi, et al.
Published: (2024)
Temporal Inconsistency Guidance for Super-resolution Video Quality Assessment
by: Li, Yixiao, et al.
Published: (2024)
Scalable Event-Based Video Streaming for Machines with MoQ
by: Freeman, Andrew C.
Published: (2025)
Object-Attribute-Relation Representation Based Video Semantic Communication
by: Du, Qiyuan, et al.
Published: (2024)