| Main authors: | Menn, Dennis; Liang, Feng; Marculescu, Diana |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2509.19589 |
Similar items
Similarity Trajectories: Linking Sampling Process to Artifacts in Diffusion-Generated Images
by: Menn, Dennis, et al.
Published: (2024)
Video Compression Meets Video Generation: Latent Inter-Frame Pruning with Attention Recovery
by: Menn, Dennis, et al.
Published: (2026)
SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners
by: Liang, Feng, et al.
Published: (2022)
Latent Inter-Frame Pruning: A Training-Free Method Bridging Traditional Video Compression and Modern Diffusion Transformers for Efficient Generation
by: Menn, Dennis, et al.
Published: (2026)
Ada-VE: Training-Free Consistent Video Editing Using Adaptive Motion Prior
by: Mahmud, Tanvir, et al.
Published: (2024)
Q-Sched: Pushing the Boundaries of Few-Step Diffusion Models with Quantization-Aware Scheduling
by: Frumkin, Natalia, et al.
Published: (2025)
Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
by: Frumkin, Natalia, et al.
Published: (2023)
Looking Backward: Streaming Video-to-Video Translation with Feature Banks
by: Liang, Feng, et al.
Published: (2024)
PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference
by: Mahmud, Tanvir, et al.
Published: (2024)
T-VSL: Text-Guided Visual Sound Source Localization in Mixtures
by: Mahmud, Tanvir, et al.
Published: (2024)
DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models
by: Wu, Weijia, et al.
Published: (2023)
ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation
by: Athar, Ali, et al.
Published: (2024)
MK-UNet: Multi-kernel Lightweight CNN for Medical Image Segmentation
by: Rahman, Md Mostafijur, et al.
Published: (2025)
LoMix: Learnable Weighted Multi-Scale Logits Mixing for Medical Image Segmentation
by: Rahman, Md Mostafijur, et al.
Published: (2025)
SpecDM: Hyperspectral Dataset Synthesis with Pixel-level Semantic Annotations
by: Liu, Wendi, et al.
Published: (2025)
Prominence-Aware Artifact Detection and Dataset for Image Super-Resolution
by: Molodetskikh, Ivan, et al.
Published: (2025)
Pixel-level Quality Assessment for Oriented Object Detection
by: Zhu, Yunhui, et al.
Published: (2025)
PixelArena: A benchmark for Pixel-Precision Visual Intelligence
by: Liang, Feng, et al.
Published: (2025)
Motion Artifact Removal in Pixel-Frequency Domain via Alternate Masks and Diffusion Model
by: Xu, Jiahua, et al.
Published: (2024)
Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos
by: Tang, Yuqi, et al.
Published: (2026)
PixelVLA: Advancing Pixel-level Understanding in Vision-Language-Action Model
by: Liang, Wenqi, et al.
Published: (2025)
IPAD-CLIP: Teaching CLIP to Detect Image Local Perceptual Artifacts
by: Wang, Juan, et al.
Published: (2026)
QuarterMap: Efficient Post-Training Token Pruning for Visual State Space Models
by: Chi, Tien-Yu, et al.
Published: (2025)
JPEG AI Image Compression Visual Artifacts: Detection Methods and Dataset
by: Tsereh, Daria, et al.
Published: (2024)
MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers
by: Mahmud, Tanvir, et al.
Published: (2024)
PixelLM: Pixel Reasoning with Large Multimodal Model
by: Ren, Zhongwei, et al.
Published: (2023)
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding
by: Zhang, Tao, et al.
Published: (2025)
Fuel Gauge: Estimating Chain-of-Thought Length Ahead of Time in Large Multimodal Models
by: Yang, Yuedong, et al.
Published: (2026)
Synthesize Boundaries: A Boundary-aware Self-consistent Framework for Weakly Supervised Salient Object Detection
by: Xu, Binwei, et al.
Published: (2022)
SynthForge: Synthesizing High-Quality Face Dataset with Controllable 3D Generative Models
by: Rawat, Abhay, et al.
Published: (2024)
OTR: Synthesizing Overlay Text Dataset for Text Removal
by: Zdenek, Jan, et al.
Published: (2025)
Simple Visual Artifact Detection in Sora-Generated Videos
by: Sugiyama, Misora, et al.
Published: (2025)
Detecting Human Artifacts from Text-to-Image Models
by: Wang, Kaihong, et al.
Published: (2024)
GeneVA: A Dataset of Human Annotations for Generative Text to Video Artifacts
by: Kang, Jenna, et al.
Published: (2025)
LEHA-CVQAD: Dataset To Enable Generalized Video Quality Assessment of Compression Artifacts
by: Gushchin, Aleksandr, et al.
Published: (2025)
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation
by: Wen, Siwei, et al.
Published: (2025)
Beyond Semantic Features: Pixel-level Mapping for Generalized AI-Generated Image Detection
by: Zhou, Chenming, et al.
Published: (2025)
Scaling Graph Convolutions for Mobile Vision
by: Avery, William, et al.
Published: (2024)
Online-LoRA: Task-free Online Continual Learning via Low Rank Adaptation
by: Wei, Xiwen, et al.
Published: (2024)
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis
by: Liang, Feng, et al.
Published: (2023)