:: Library Catalog

Bibliographic Details
Main authors:	Sreelatha, Silpa Vadakkeeveetil; Nag, Sauradip; Awais, Muhammad; Belongie, Serge; Dutta, Anjan
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online access:	https://arxiv.org/abs/2509.15257
Similar items

RAIGen: Rare Attribute Identification in Text-to-Image Generative Models
by: Sreelatha, Silpa Vadakkeeveetil, et al.
Published: (2026)

DeNetDM: Debiasing by Network Depth Modulation
by: Sreelatha, Silpa Vadakkeeveetil, et al.
Published: (2024)

OmniCount: Multi-label Object Counting with Semantic-Geometric Priors
by: Mondal, Anindya, et al.
Published: (2024)

CountLoop: Training-Free High-Instance Image Generation via Iterative Agent Guidance
by: Mondal, Anindya, et al.
Published: (2025)

Actor-agnostic Multi-label Action Recognition with Multi-modal Query
by: Mondal, Anindya, et al.
Published: (2023)

In-2-4D: Inbetweening from Two Single-View Images to 4D Generation
by: Nag, Sauradip, et al.
Published: (2025)

Articulate That Object Part (ATOP): 3D Part Articulation via Text and Motion Personalization
by: Vora, Aditya, et al.
Published: (2025)

ASIA: Adaptive 3D Segmentation using Few Image Annotations
by: Perla, Sai Raj Kishore, et al.
Published: (2025)

MMEarth-Bench: Global Model Adaptation via Multimodal Test-Time Training
by: Gordon, Lucia, et al.
Published: (2026)

SMITE: Segment Me In TimE
by: Alimohammadi, Amirhossein, et al.
Published: (2024)

PhysConvex: Physics-Informed 3D Dynamic Convex Radiance Fields for Reconstruction and Simulation
by: Wang, Dan, et al.
Published: (2026)

Dual-Stage Value-Guided Inference with Margin-Based Reward Adjustment for Fast and Faithful VLM Captioning
by: Deria, Ankan, et al.
Published: (2025)

HiddenObjects: Scalable Diffusion-Distilled Spatial Priors for Object Placement
by: Schouten, Marco, et al.
Published: (2026)

Labeled Data Selection for Category Discovery
by: Zhao, Bingchen, et al.
Published: (2024)

FaithDiff: Unleashing Diffusion Priors for Faithful Image Super-resolution
by: Chen, Junyang, et al.
Published: (2024)

Stitch: Training-Free Position Control in Multimodal Diffusion Transformers
by: Bader, Jessica, et al.
Published: (2025)

Unlearning-based Neural Interpretations
by: Choi, Ching Lam, et al.
Published: (2024)

TaleDiffusion: Multi-Character Story Generation with Dialogue Rendering
by: Banerjee, Ayan, et al.
Published: (2025)

Benchmarking Large Vision-Language Models on Fine-Grained Image Tasks: A Comprehensive Evaluation
by: Yu, Hong-Tao, et al.
Published: (2025)

POEM: Precise Object-level Editing via MLLM control
by: Schouten, Marco, et al.
Published: (2025)

Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model
by: An, Zhaochong, et al.
Published: (2025)

Advances in 4D Representation: Geometry, Motion, and Interaction
by: Zhao, Mingrui, et al.
Published: (2025)

DC-ViT: Modulating Spatial and Channel Interactions for Multi-Channel Images
by: Marikkar, Umar, et al.
Published: (2026)

Noise-Coded Illumination for Forensic and Photometric Video Analysis
by: Michael, Peter F., et al.
Published: (2025)

Towards Faithful Multimodal Concept Bottleneck Models
by: Moreau, Pierre, et al.
Published: (2026)

Familiarity-Based Open-Set Recognition Under Adversarial Attacks
by: Enevoldsen, Philip, et al.
Published: (2023)

Learning Conditional Invariances through Non-Commutativity
by: Chaudhuri, Abhra, et al.
Published: (2024)

VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward
by: An, Zhaochong, et al.
Published: (2026)

CLIPDraw++: Text-to-Sketch Synthesis with Simple Primitives
by: Mathur, Nityanand, et al.
Published: (2023)

DeltaDiff: Reality-Driven Diffusion with AnchorResiduals for Faithful SR
by: Yang, Chao, et al.
Published: (2025)

Better Language Models Exhibit Higher Visual Alignment
by: Ruthardt, Jona, et al.
Published: (2024)

Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
by: Pach, Mateusz, et al.
Published: (2025)

The Latent Color Subspace: Emergent Order in High-Dimensional Chaos
by: Pach, Mateusz, et al.
Published: (2026)

SuperF: Neural Implicit Fields for Multi-Image Super-Resolution
by: Jyhne, Sander Riisøen, et al.
Published: (2025)

Taxonomy-Aware Evaluation of Vision-Language Models
by: Snæbjarnarson, Vésteinn, et al.
Published: (2025)

Revisiting the Perception-Distortion Trade-off with Spatial-Semantic Guided Super-Resolution
by: Wang, Dan, et al.
Published: (2026)

Assessing Neural Network Robustness via Adversarial Pivotal Tuning
by: Christensen, Peter Ebert, et al.
Published: (2022)

Domain Adaptation Without the Compute Burden for Efficient Whole Slide Image Analysis
by: Marikkar, Umar, et al.
Published: (2026)

Cora: Correspondence-aware image editing using few step diffusion
by: Alimohammadi, Amirhossein, et al.
Published: (2025)

On the Faithfulness of Vision Transformer Explanations
by: Wu, Junyi, et al.
Published: (2024)