| Main Authors: | Luo, Yanan; Yi, Jinhui; Farha, Yazan Abu; Wolter, Moritz; Gall, Juergen |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.09431 |
Similar Items
Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation
by: Zatsarynna, Olga, et al.
Published: (2024)
MANTA: Diffusion Mamba for Efficient and Effective Stochastic Long-Term Dense Anticipation
by: Zatsarynna, Olga, et al.
Published: (2025)
Looking into the Unknown: Exploring Action Discovery for Segmentation of Known and Unknown Actions
by: Spurio, Federico, et al.
Published: (2025)
MV-Match: Multi-View Matching for Domain-Adaptive Identification of Plant Nutrient Deficiencies
by: Yi, Jinhui, et al.
Published: (2024)
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models
by: Yi, Jinhui, et al.
Published: (2024)
Fréchet Wavelet Distance: A Domain-Agnostic Metric for Image Generation
by: Veeramacheneni, Lokesh, et al.
Published: (2023)
Identifying Spatio-Temporal Drivers of Extreme Events
by: Eddin, Mohamad Hakam Shams, et al.
Published: (2024)
LC-SLab -- An Object-based Deep Learning Framework for Large-scale Land Cover Classification from Satellite Imagery and Sparse In-situ Labels
by: Leonhardt, Johannes, et al.
Published: (2025)
CamC2V: Context-aware Controllable Video Generation
by: Denninger, Luis, et al.
Published: (2025)
Video Panels for Long Video Understanding
by: Doorenbos, Lars, et al.
Published: (2025)
Using Visual Anomaly Detection for Task Execution Monitoring
by: Thoduka, Santosh, et al.
Published: (2021)
Learning a Neural Association Network for Self-supervised Multi-Object Tracking
by: Li, Shuai, et al.
Published: (2024)
ADA-Track++: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association
by: Ding, Shuxiao, et al.
Published: (2024)
Hierarchical Vector Quantization for Unsupervised Action Segmentation
by: Spurio, Federico, et al.
Published: (2024)
StableMamba: Distillation-free Scaling of Large SSMs for Images and Videos
by: Suleman, Hamid, et al.
Published: (2024)
Skeleton Motion Words for Unsupervised Skeleton-Based Temporal Action Segmentation
by: Gökay, Uzay, et al.
Published: (2025)
FlowNar: Scalable Streaming Narration for Long-Form Videos
by: Zhong, Zeyun, et al.
Published: (2026)
A Survey on Deep Learning Techniques for Action Anticipation
by: Zhong, Zeyun, et al.
Published: (2023)
Self-Intersection-Aware 3D Human Motion Generation Using an Efficient Human Sphere Proxy
by: Herrmann, Pascal, et al.
Published: (2026)
A Multimodal Handover Failure Detection Dataset and Baselines
by: Thoduka, Santosh, et al.
Published: (2024)
Enhancing Video-Based Robot Failure Detection Using Task Knowledge
by: Thoduka, Santosh, et al.
Published: (2025)
MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation
by: Wasim, Syed Talal, et al.
Published: (2025)
SyncVP: Joint Diffusion for Synchronous Multi-Modal Video Prediction
by: Pallotta, Enrico, et al.
Published: (2025)
Privacy-Preserving Semantic Segmentation from Ultra-Low-Resolution RGB Inputs
by: Huang, Xuying, et al.
Published: (2025)
Improving action segmentation via explicit similarity measurement
by: Aouaidjia, Kamel, et al.
Published: (2025)
REVEAL: Relation-based Video Representation Learning for Video-Question-Answering
by: Chaybouti, Sofian, et al.
Published: (2025)
GroupMamba: Efficient Group-Based Visual State Space Model
by: Shaker, Abdelrahman, et al.
Published: (2024)
STRIVE: Structured Spatiotemporal Exploration for Reinforcement Learning in Video Question Answering
by: Bahrami, Emad, et al.
Published: (2026)
Towards Generalizing Temporal Action Segmentation to Unseen Views
by: Bahrami, Emad, et al.
Published: (2025)
RiverMamba: A State Space Model for Global River Discharge and Flood Forecasting
by: Eddin, Mohamad Hakam Shams, et al.
Published: (2025)
Sequence-Adaptive Video Prediction in Continuous Streams using Diffusion Noise Optimization
by: Azar, Sina Mokhtarzadeh, et al.
Published: (2025)
EgoControl: Controllable Egocentric Video Generation via 3D Full-Body Poses
by: Pallotta, Enrico, et al.
Published: (2025)
TFNet: Exploiting Temporal Cues for Fast and Accurate LiDAR Semantic Segmentation
by: Li, Rong, et al.
Published: (2023)
Massively Multi-Person 3D Human Motion Forecasting with Scene Context
by: Mueller, Felix B, et al.
Published: (2024)
Global-Aware Monocular Semantic Scene Completion with State Space Models
by: Li, Shijie, et al.
Published: (2025)
TQD-Track: Temporal Query Denoising for 3D Multi-Object Tracking
by: Ding, Shuxiao, et al.
Published: (2025)
Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models
by: Wang, Jifeng, et al.
Published: (2024)
BIKED++: A Multimodal Dataset of 1.4 Million Bicycle Image and Parametric CAD Designs
by: Regenwetter, Lyle, et al.
Published: (2024)
Spatio-temporal Decoupled Knowledge Compensator for Few-Shot Action Recognition
by: Qu, Hongyu, et al.
Published: (2026)
TadML: A fast temporal action detection with Mechanics-MLP
by: Deng, Bowen, et al.
Published: (2022)