| Main Authors: | Garosi, Marco, Tedoldi, Riccardo, Boscaini, Davide, Mancini, Massimiliano, Sebe, Nicu, Poiesi, Fabio |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2412.04247 |
Similar Items
Accurate and efficient zero-shot 6D pose estimation with frozen foundation models
by: Caraffa, Andrea, et al.
Published: (2025)
Distilling 3D distinctive local descriptors for 6D pose estimation
by: Hamza, Amir, et al.
Published: (2025)
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
by: Li, Jinlong, et al.
Published: (2025)
FreeZe: Training-free zero-shot 6D pose estimation with geometric and vision foundation models
by: Caraffa, Andrea, et al.
Published: (2023)
Functionality understanding and segmentation in 3D scenes
by: Corsetti, Jaime, et al.
Published: (2024)
Generative 6D Pose Estimation via Conditional Flow Matching
by: Hamza, Amir, et al.
Published: (2026)
Leveraging Confident Image Regions for Source-Free Domain-Adaptive Object Detection
by: Mekhalfi, Mohamed Lamine, et al.
Published: (2025)
Open-vocabulary object 6D pose estimation
by: Corsetti, Jaime, et al.
Published: (2023)
Fully-Geometric Cross-Attention for Point Cloud Registration
by: Wang, Weijie, et al.
Published: (2025)
High-resolution open-vocabulary object 6D pose estimation
by: Corsetti, Jaime, et al.
Published: (2024)
Geometrically-driven Aggregation for Zero-shot 3D Point Cloud Understanding
by: Mei, Guofeng, et al.
Published: (2023)
Action-guided generation of 3D functionality segmentation data
by: Corsetti, Jaime, et al.
Published: (2025)
AI-driven visual monitoring of industrial assembly tasks
by: Nardon, Mattia, et al.
Published: (2025)
Large Multimodal Models as General In-Context Classifiers
by: Garosi, Marco, et al.
Published: (2026)
Compositional Caching for Training-free Open-vocabulary Attribute Detection
by: Garosi, Marco, et al.
Published: (2025)
Self-Supervised and Generalizable Tokenization for CLIP-Based 3D Understanding
by: Mei, Guofeng, et al.
Published: (2025)
Masked Clustering Prediction for Unsupervised Point Cloud Pre-training
by: Ren, Bin, et al.
Published: (2025)
Revisiting Fully Convolutional Geometric Features for Object 6D Pose Estimation
by: Jaime Corsetti, Davide Boscaini, Fabio Poiesi
Published: (2026)
Cues3D: Unleashing the Power of Sole NeRF for Consistent and Unique Instances in Open-Vocabulary 3D Panoptic Segmentation
by: Xue, Feng, et al.
Published: (2025)
Safe Vision-Language Models via Unsafe Weights Manipulation
by: D'Incà, Moreno, et al.
Published: (2025)
GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models
by: D'Incà, Moreno, et al.
Published: (2024)
ZeroReg: Zero-Shot Point Cloud Registration with Foundation Models
by: Wang, Weijie, et al.
Published: (2023)
Vocabulary-Free 3D Instance Segmentation with Vision and Language Assistant
by: Mei, Guofeng, et al.
Published: (2024)
3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance
by: Xu, Xiaoxu, et al.
Published: (2024)
LESS: Label-Efficient and Single-Stage Referring 3D Segmentation
by: Liu, Xuexun, et al.
Published: (2024)
Exploring Fine-grained Retail Product Discrimination with Zero-shot Object Classification Using Vision-Language Models
by: Tur, Anil Osman, et al.
Published: (2024)
RankFeat&RankWeight: Rank-1 Feature/Weight Removal for Out-of-distribution Detection
by: Song, Yue, et al.
Published: (2023)
High-Fidelity 3D Facial Avatar Synthesis with Controllable Fine-Grained Expressions
by: He, Yikang, et al.
Published: (2026)
CHIP: A multi-sensor dataset for 6D pose estimation of chairs in industrial settings
by: Nardon, Mattia, et al.
Published: (2025)
3D Weakly Supervised Semantic Segmentation via Class-Aware and Geometry-Guided Pseudo-Label Refinement
by: Xu, Xiaoxu, et al.
Published: (2025)
AlignCAT: Visual-Linguistic Alignment of Category and Attribute for Weakly Supervised Visual Grounding
by: Wang, Yidan, et al.
Published: (2025)
PoInit-of-View: Poisoning Initialization of Views Transfers Across Multiple 3D Reconstruction Systems
by: Wang, Weijie, et al.
Published: (2026)
CLIP is Strong Enough to Fight Back: Test-time Counterattacks towards Zero-shot Adversarial Robustness of CLIP
by: Xing, Songlong, et al.
Published: (2025)
Novel class discovery meets foundation models for 3D semantic segmentation
by: Riz, Luigi, et al.
Published: (2023)
Orthogonal Projection Subspace to Aggregate Online Prior-knowledge for Continual Test-time Adaptation
by: Li, Jinlong, et al.
Published: (2025)
FreeInsert: Disentangled Text-Guided Object Insertion in 3D Gaussian Scene without Spatial Priors
by: Li, Chenxi, et al.
Published: (2025)
Token Reduction via Local and Global Contexts Optimization for Efficient Video Large Language Models
by: Li, Jinlong, et al.
Published: (2026)
Hierarchical Visual Prompt Learning for Continual Video Instance Segmentation
by: Dong, Jiahua, et al.
Published: (2025)
6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model
by: Bortolon, Matteo, et al.
Published: (2024)
Asymmetric GANs for Image-to-Image Translation
by: Tang, Hao, et al.
Published: (2019)