| Main Author: | Salgado, Alberto G. Rodriguez |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.26839 |
Similar Items
What Makes a Maze Look Like a Maze?
by: Hsu, Joy, et al.
Published: (2024)
From Semantics to Pixels: Coarse-to-Fine Masked Autoencoders for Hierarchical Visual Understanding
by: Xiang, Wenzhao, et al.
Published: (2026)
Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training
by: Zhang, Jiacheng, et al.
Published: (2024)
From Pixels to Components: Eigenvector Masking for Visual Representation Learning
by: Bizeul, Alice, et al.
Published: (2025)
Weakly Supervised Pixel-Level Annotation with Visual Interpretability
by: Nasir, Basma, et al.
Published: (2025)
Structure over Pixels: Learning Variable-Length Visual Programs
by: Wyrwiński, Piotr, et al.
Published: (2026)
Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy?
by: Asadi, Nader, et al.
Published: (2024)
From Pixels to Patches: Pooling Strategies for Earth Embeddings
by: Corley, Isaac, et al.
Published: (2026)
Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models
by: Vasconcelos, Cristina N., et al.
Published: (2024)
Improving Accuracy and Generalization for Efficient Visual Tracking
by: Zaveri, Ram, et al.
Published: (2024)
From Pixels to Graphs: Deep Graph-Level Anomaly Detection on Dermoscopic Images
by: Xu, Dehn, et al.
Published: (2025)
Pixels Versus Priors: Controlling Knowledge Priors in Vision-Language Models through Visual Counterfacts
by: Golovanevsky, Michal, et al.
Published: (2025)
The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs
by: Li, Hong, et al.
Published: (2024)
From Pixels to Perception: Interpretable Predictions via Instance-wise Grouped Feature Selection
by: Vandenhirtz, Moritz, et al.
Published: (2025)
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models
by: NVIDIA, et al.
Published: (2024)
Don't Show Pixels, Show Cues: Unlocking Visual Tool Reasoning in Language Models via Perception Programs
by: Janjua, Muhammad Kamran, et al.
Published: (2026)
From Pixels to Prose: A Large Dataset of Dense Image Captions
by: Singla, Vasu, et al.
Published: (2024)
From Pixels to Feelings: Aligning MLLMs with Human Cognitive Perception of Images
by: Chen, Yiming, et al.
Published: (2025)
Rethinking Generative Image Pretraining: How Far Are We From Scaling Up Next-Pixel Prediction?
by: Yan, Xinchen, et al.
Published: (2025)
Video Prediction of Dynamic Physical Simulations With Pixel-Space Spatiotemporal Transformers
by: Slack, Dean L, et al.
Published: (2025)
ImpliHateVid: A Benchmark Dataset and Two-stage Contrastive Learning Framework for Implicit Hate Speech Detection in Videos
by: Rehman, Mohammad Zia Ur, et al.
Published: (2025)
Bias Redistribution in Visual Machine Unlearning: Does Forgetting One Group Harm Another?
by: Haruna, Yunusa, et al.
Published: (2026)
Identifying Important Group of Pixels using Interactions
by: Sumiyasu, Kosuke, et al.
Published: (2024)
Pixels to Prose: Understanding the art of Image Captioning
by: Singh, Hrishikesh, et al.
Published: (2024)
From Pixels to Words: Leveraging Explainability in Face Recognition through Interactive Natural Language Processing
by: DeAndres-Tame, Ivan, et al.
Published: (2024)
Pix2Cap-COCO: Advancing Visual Comprehension via Pixel-Level Captioning
by: You, Zuyao, et al.
Published: (2025)
PRIX: Learning to Plan from Raw Pixels for End-to-End Autonomous Driving
by: Wozniak, Maciej K., et al.
Published: (2025)
When Does Visual Prompting Outperform Linear Probing for Vision-Language Models? A Likelihood Perspective
by: Tsao, Hsi-Ai, et al.
Published: (2024)
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
by: Nguyen, Duy-Kien, et al.
Published: (2024)
Pixel-level Counterfactual Contrastive Learning for Medical Image Segmentation
by: Lafargue-Hauret, Marceau, et al.
Published: (2026)
From Pixels to Prose: Advancing Multi-Modal Language Models for Remote Sensing
by: Sun, Xintian, et al.
Published: (2024)
On the Out-of-Distribution Generalization of Reasoning in Multimodal LLMs for Simple Visual Planning Tasks
by: Neuhaus, Yannic, et al.
Published: (2026)
From Time-series Generation, Model Selection to Transfer Learning: A Comparative Review of Pixel-wise Approaches for Large-scale Crop Mapping
by: Long, Judy, et al.
Published: (2025)
From pre-training to downstream performance: Does domain-specific pre-training make sense?
by: Krones, Felix
Published: (2026)
Latent Forcing: Reordering the Diffusion Trajectory for Pixel-Space Image Generation
by: Baade, Alan, et al.
Published: (2026)
FREPix: Frequency-Heterogeneous Flow Matching for Pixel-Space Image Generation
by: Lin, Mingfeng, et al.
Published: (2026)
Diffusion Model Guided Sampling with Pixel-Wise Aleatoric Uncertainty Estimation
by: De Vita, Michele, et al.
Published: (2024)
Improving Out-of-Domain Robustness with Targeted Augmentation in Frequency and Pixel Spaces
by: Wang, Ruoqi, et al.
Published: (2025)
EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models
by: Namekata, Koichi, et al.
Published: (2024)
CoordFlow: Coordinate Flow for Pixel-wise Neural Video Representation
by: Silver, Daniel, et al.
Published: (2025)