:: Library Catalog

Bibliographic Details
Main Authors:	Tian, Boyuan; Pang, Yihan; Huzaifa, Muhammad; Wang, Shenlong; Adve, Sarita
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2409.04018

Similar Items

EFSA: Episodic Few-Shot Adaptation for Text-to-Image Retrieval
by: Huzaifa, Muhammad, et al.
Published: (2024)

CNN and ViT Efficiency Study on Tiny ImageNet and DermaMNIST Datasets
by: Amangeldi, Aidar, et al.
Published: (2025)

PhysGen3D: Crafting a Miniature Interactive World from a Single Image
by: Chen, Boyuan, et al.
Published: (2025)

Towards Latency-Aware 3D Streaming Perception for Autonomous Driving
by: Peng, Jiaqi, et al.
Published: (2025)

Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Vision-Language Models
by: Imam, Raza, et al.
Published: (2024)

AgMMU: A Comprehensive Agricultural Multimodal Understanding Benchmark
by: Gauba, Aruna, et al.
Published: (2025)

Vision-Language Navigation with Energy-Based Policy
by: Liu, Rui, et al.
Published: (2024)

Multi-Label Out-of-Distribution Detection with Spectral Normalized Joint Energy
by: Mei, Yihan, et al.
Published: (2024)

NoPo-Avatar: Generalizable and Animatable Avatars from Sparse Inputs without Human Poses
by: Wen, Jing, et al.
Published: (2025)

LIFe-GoM: Generalizable Human Rendering with Learned Iterative Feedback Over Multi-Resolution Gaussians-on-Mesh
by: Wen, Jing, et al.
Published: (2025)

EnergyFormer: Energy Attention with Fourier Embedding for Hyperspectral Image Classification
by: Sohail, Saad, et al.
Published: (2025)

Domain Adaptable Fine-Tune Distillation Framework For Advancing Farm Surveillance
by: Imam, Raza, et al.
Published: (2024)

Human-like Navigation in a World Built for Humans
by: Chandaka, Bhargav, et al.
Published: (2025)

Double-Exponential Increases in Inference Energy: The Cost of the Race for Accuracy
by: Yang, Zeyu, et al.
Published: (2024)

Energy-Latency Attacks via Sponge Poisoning
by: Cinà, Antonio Emanuele, et al.
Published: (2022)

Towards Reproducible Learning-based Compression
by: Pang, Jiahao, et al.
Published: (2024)

Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images
by: Gao, Kuofeng, et al.
Published: (2024)

LidarDM: Generative LiDAR Simulation in a Generated World
by: Zyrianov, Vlas, et al.
Published: (2024)

Better, Stronger, Faster: Tackling the Trilemma in MLLM-based Segmentation with Simultaneous Textual Mask Prediction
by: Liu, Jiazhen, et al.
Published: (2025)

TriAlignXA: An Explainable Trilemma Alignment Framework for Trustworthy Agri-product Grading
by: Xie, Jianfei, et al.
Published: (2025)

RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation
by: Cao, Boyuan, et al.
Published: (2024)

Energy-Latency Manipulation of Multi-modal Large Language Models via Verbose Samples
by: Gao, Kuofeng, et al.
Published: (2024)

LLIA -- Enabling Low-Latency Interactive Avatars: Real-Time Audio-Driven Portrait Video Generation with Diffusion Models
by: Yu, Haojie, et al.
Published: (2025)

Micro-Structures Graph-Based Point Cloud Registration for Balancing Efficiency and Accuracy
by: Zhang, Rongling, et al.
Published: (2024)

Structure from Duplicates: Neural Inverse Graphics from a Pile of Objects
by: Cheng, Tianhang, et al.
Published: (2024)

GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh
by: Wen, Jing, et al.
Published: (2024)

MonoPatchNeRF: Improving Neural Radiance Fields with Patch-based Monocular Guidance
by: Wu, Yuqun, et al.
Published: (2024)

Plenoptic PNG: Real-Time Neural Radiance Fields in 150 KB
by: Lee, Jae Yong, et al.
Published: (2024)

Navigating the Accuracy-Size Trade-Off with Flexible Model Merging
by: Dhasade, Akash, et al.
Published: (2025)

WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent Benchmark
by: Yuan, Peng, et al.
Published: (2026)

ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes
by: Malik, Hashmat Shadab, et al.
Published: (2024)

Enhancing Boundary Segmentation for Topological Accuracy with Skeleton-based Methods
by: Liu, Chuni, et al.
Published: (2024)

DISHA: Low-Energy Sparse Transformer at Edge for Outdoor Navigation for the Visually Impaired Individuals
by: Nagil, Praveen, et al.
Published: (2024)

Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach
by: Wang, Shiao, et al.
Published: (2025)

Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation
by: Hong, Zong-Wei, et al.
Published: (2024)

EdgePoint2: Compact Descriptors for Superior Efficiency and Accuracy
by: Yao, Haodi, et al.
Published: (2025)

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
by: Wang, Xiaofeng, et al.
Published: (2024)

AutoVFX: Physically Realistic Video Editing from Natural Language Instructions
by: Hsu, Hao-Yu, et al.
Published: (2024)

Generalizable Sparse-View 3D Reconstruction from Unconstrained Images
by: Gupta, Vinayak, et al.
Published: (2026)

AD-GS: Object-Aware B-Spline Gaussian Splatting for Self-Supervised Autonomous Driving
by: Xu, Jiawei, et al.
Published: (2025)