| Main Authors: | A., Eshwar R., Pal, Debnath |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.14631 |
Similar Items
Force Matching with Relativistic Constraints: A Physics-Inspired Approach to Stable and Efficient Generative Modeling
by: Cao, Yang, et al.
Published: (2025)
ActionParty: Multi-Subject Action Binding in Generative Video Games
by: Pondaven, Alexander, et al.
Published: (2026)
Generative Image as Action Models
by: Shridhar, Mohit, et al.
Published: (2024)
Zero-Shot Action Generalization with Limited Observations
by: Alchihabi, Abdullah, et al.
Published: (2025)
Task-conditioned Ensemble of Expert Models for Continuous Learning
by: Sharma, Renu, et al.
Published: (2025)
Towards Generalizing Temporal Action Segmentation to Unseen Views
by: Bahrami, Emad, et al.
Published: (2025)
Enhancing Generalization in Vision-Language-Action Models by Preserving Pretrained Representations
by: Grover, Shresth, et al.
Published: (2025)
FlyPrompt: Brain-Inspired Random-Expanded Routing with Temporal-Ensemble Experts for General Continual Learning
by: Yan, Hongwei, et al.
Published: (2026)
Generative AI in Depth: A Survey of Recent Advances, Model Variants, and Real-World Applications
by: Yazdani, Shamim, et al.
Published: (2025)
PhiNet v2: A Mask-Free Brain-Inspired Vision Foundation Model from Video
by: Yamada, Makoto, et al.
Published: (2025)
Communication-Inspired Tokenization for Structured Image Representations
by: Davtyan, Aram, et al.
Published: (2026)
MM-PoE: Multiple Choice Reasoning via. Process of Elimination using Multi-Modal Models
by: Chakrabarty, Sayak, et al.
Published: (2024)
Vision-Language Models Unlock Task-Centric Latent Actions
by: Nikulin, Alexander, et al.
Published: (2026)
Olaf-World: Orienting Latent Actions for Video World Modeling
by: Jiang, Yuxin, et al.
Published: (2026)
Action-Agnostic Point-Level Supervision for Temporal Action Detection
by: Yoshida, Shuhei M., et al.
Published: (2024)
Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality
by: Luo, Ge Ya, et al.
Published: (2024)
One-Frame Calibration with Siamese Network in Facial Action Unit Recognition
by: Feng, Shuangquan, et al.
Published: (2024)
Video Action Differencing
by: Burgess, James, et al.
Published: (2025)
From Spatial to Actions: Grounding Vision-Language-Action Model in Spatial Foundation Priors
by: Zhang, Zhengshen, et al.
Published: (2025)
NinA: Normalizing Flows in Action. Training VLA Models with Normalizing Flows
by: Tarasov, Denis, et al.
Published: (2025)
Conformal uncertainty quantification to evaluate predictive fairness of foundation AI model for skin lesion classes across patient demographics
by: Bhattacharyya, Swarnava, et al.
Published: (2025)
Cognitive Science-Inspired Evaluation of Core Capabilities for Object Understanding in AI
by: Rutar, Danaja, et al.
Published: (2025)
Semantically Guided Action Anticipation
by: Diko, Anxhelo, et al.
Published: (2024)
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model
by: Alomar, Khaled, et al.
Published: (2024)
StarFlow: Generating Structured Workflow Outputs From Sketch Images
by: Bechard, Patrice, et al.
Published: (2025)
Hybrid Training for Vision-Language-Action Models
by: Mazzaglia, Pietro, et al.
Published: (2025)
SkelMamba: A State Space Model for Efficient Skeleton Action Recognition of Neurological Disorders
by: Martinel, Niki, et al.
Published: (2024)
Skeleton-based Action Recognition with Non-linear Dependency Modeling and Hilbert-Schmidt Independence Criterion
by: Yang, Yuheng
Published: (2024)
SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters
by: Jiang, Jianping, et al.
Published: (2024)
A Survey on Efficient Vision-Language-Action Models
by: Yu, Zhaoshu, et al.
Published: (2025)
Interactive Post-Training for Vision-Language-Action Models
by: Tan, Shuhan, et al.
Published: (2025)
Compositional Entailment Learning for Hyperbolic Vision-Language Models
by: Pal, Avik, et al.
Published: (2024)
Feature Hallucination for Self-supervised Action Recognition
by: Wang, Lei, et al.
Published: (2025)
Evolving Skeletons: Motion Dynamics in Action Recognition
by: Qiu, Jushang, et al.
Published: (2025)
Semantically Guided Representation Learning For Action Anticipation
by: Diko, Anxhelo, et al.
Published: (2024)
Fly-CL: A Fly-Inspired Framework for Enhancing Efficient Decorrelation and Reduced Training Time in Pre-trained Model-based Continual Representation Learning
by: Zou, Heming, et al.
Published: (2025)
AdaWorld: Learning Adaptable World Models with Latent Actions
by: Gao, Shenyuan, et al.
Published: (2025)
Grounding Video Models to Actions through Goal Conditioned Exploration
by: Luo, Yunhao, et al.
Published: (2024)
Latent Action Learning Requires Supervision in the Presence of Distractors
by: Nikulin, Alexander, et al.
Published: (2025)
VISAGE: Video Synthesis using Action Graphs for Surgery
by: Yeganeh, Yousef, et al.
Published: (2024)