:: Library Catalog

Bibliographic Details
Main Authors:	Paudel, Pramish; Khanal, Anubhav; Chhatkuli, Ajad; Paudel, Danda Pani; Tandukar, Jyoti
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2407.11174

Similar Items

Continuous Pose for Monocular Cameras in Neural Implicit Representation
by: Ma, Qi, et al.
Published: (2023)

Inferring Compositional 4D Scenes without Ever Seeing One
by: Gokmen, Ahmet Berke, et al.
Published: (2025)

EgoNight: Towards Egocentric Vision Understanding at Night with a Challenging Benchmark
by: Zhang, Deheng, et al.
Published: (2025)

Cross-View Multi-Modal Segmentation @ Ego-Exo4D Challenges 2025
by: Fu, Yuqian, et al.
Published: (2025)

EgoCross: Benchmarking Multimodal Large Language Models for Cross-Domain Egocentric Video Question Answering
by: Li, Yanjun, et al.
Published: (2025)

Learning Generative Interactive Environments By Trained Agent Exploration
by: Kazemi, Naser, et al.
Published: (2024)

EvenNICER-SLAM: Event-based Neural Implicit Encoding SLAM
by: Chen, Shi, et al.
Published: (2024)

From Synchrony to Sequence: Exo-to-Ego Generation via Interpolation
by: Mahdi, Mohammad, et al.
Published: (2026)

ObjectRelator: Enabling Cross-View Object Relation Understanding Across Ego-Centric and Exo-Centric Perspectives
by: Fu, Yuqian, et al.
Published: (2024)

Lego: Learning to Disentangle and Invert Personalized Concepts Beyond Object Appearance in Text-to-Image Diffusion Models
by: Motamed, Saman, et al.
Published: (2023)

EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation
by: Qu, Qiang, et al.
Published: (2025)

Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes
by: Ma, Qi, et al.
Published: (2024)

Vision encoders should be image size agnostic and task driven
by: Prisadnikov, Nedyalko, et al.
Published: (2025)

Self-supervised pretraining for an iterative image size agnostic vision transformer
by: Prisadnikov, Nedyalko, et al.
Published: (2026)

IMPACT: A Dataset for Multi-Granularity Human Procedural Action Understanding in Industrial Assembly
by: Wen, Di, et al.
Published: (2026)

Multi-identity Human Image Animation with Structural Video Diffusion
by: Wang, Zhenzhi, et al.
Published: (2025)

Exo2EgoSyn: Unlocking Foundation Video Generation Models for Exocentric-to-Egocentric Video Synthesis
by: Mahdi, Mohammad, et al.
Published: (2025)

A Simple and Generalist Approach for Panoptic Segmentation
by: Prisadnikov, Nedyalko, et al.
Published: (2024)

RICO: Two Realistic Benchmarks and an In-Depth Analysis for Incremental Learning in Object Detection
by: Neuwirth-Trapp, Matthias, et al.
Published: (2025)

Incremental Object Detection with Prompt-based Methods
by: Neuwirth-Trapp, Matthias, et al.
Published: (2025)

Occam's LGS: An Efficient Approach for Language Gaussian Splatting
by: Cheng, Jiahuan, et al.
Published: (2024)

Zero-Shot Automatic Annotation and Instance Segmentation using LLM-Generated Datasets: Eliminating Field Imaging and Manual Annotation for Deep Learning Model Development
by: Sapkota, Ranjan, et al.
Published: (2024)

Taming CLIP for Fine-grained and Structured Visual Understanding of Museum Exhibits
by: Balauca, Ada-Astrid, et al.
Published: (2024)

StableAnimator: High-Quality Identity-Preserving Human Image Animation
by: Tu, Shuyuan, et al.
Published: (2024)

SeasonScapes: Learning Large-scale Re-lightable 3D Landscapes with Seasonal Variation from Sparse Webcams
by: Kleger, Timo, et al.
Published: (2026)

Ternary-Type Opacity and Hybrid Odometry for RGB NeRF-SLAM
by: Lin, Junru, et al.
Published: (2023)

EverAnimate: Minute-Scale Human Animation via Latent Flow Restoration
by: Li, Wuyang, et al.
Published: (2026)

StableAnimator++: Overcoming Pose Misalignment and Face Distortion for Human Image Animation
by: Tu, Shuyuan, et al.
Published: (2025)

Implicit Preference Alignment for Human Image Animation
by: Wang, Yuanzhi, et al.
Published: (2026)

From Scan to Action: Leveraging Realistic Scans for Embodied Scene Understanding
by: Halacheva, Anna-Maria, et al.
Published: (2025)

Versatile Multimodal Controls for Expressive Talking Human Animation
by: Qin, Zheng, et al.
Published: (2025)

RobustFormer: Noise-Robust Pre-training for images and videos
by: Bastola, Ashish, et al.
Published: (2024)

Efficient Degradation-aware Any Image Restoration
by: Zamfir, Eduard, et al.
Published: (2024)

ReVLA: Reverting Visual Domain Limitation of Robotic Foundation Models
by: Dey, Sombit, et al.
Published: (2024)

InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions
by: Wang, Zhenzhi, et al.
Published: (2025)

Partial CLIP is Enough: Chimera-Seg for Zero-shot Semantic Segmentation
by: Chen, Jialei, et al.
Published: (2025)

Rethinking Global Context in Crowd Counting
by: Sun, Guolei, et al.
Published: (2021)

Reducing Unimodal Bias in Multi-Modal Semantic Segmentation with Multi-Scale Functional Entropy Regularization
by: Zheng, Xu, et al.
Published: (2025)

ConceptPose: Training-Free Zero-Shot Object Pose Estimation using Concept Vectors
by: Kuang, Liming, et al.
Published: (2025)

Exploration-Driven Generative Interactive Environments
by: Savov, Nedko, et al.
Published: (2025)