:: Library Catalog

Bibliographic Details
Main Author:	Premi, Santosh
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2605.17165

Similar Items

Delta-Based Neural Architecture Search: LLM Fine-Tuning via Code Diffs
by: Adhikari, Santosh Premi, et al.
Published: (2026)

BRo-JEPA: Learning Modular Arithmetic in Latent Space
by: Jha, Divyansh, et al.
Published: (2026)

Rethinking JEPA: Compute-Efficient Video SSL with Frozen Teachers
by: Li, Xianhang, et al.
Published: (2025)

Text-Conditional JEPA for Learning Semantically Rich Visual Representations
by: Huang, Chen, et al.
Published: (2026)

KerJEPA: Kernel Discrepancies for Euclidean Self-Supervised Learning
by: Zimmermann, Eric, et al.
Published: (2025)

DDLP: Unsupervised Object-Centric Video Prediction with Deep Dynamic Latent Particles
by: Daniel, Tal, et al.
Published: (2023)

V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning
by: Assran, Mido, et al.
Published: (2025)

Rectified LpJEPA: Joint-Embedding Predictive Architectures with Sparse and Maximum-Entropy Representations
by: Kuang, Yilun, et al.
Published: (2026)

From Image to Video: An Empirical Study of Diffusion Representations
by: Vélez, Pedro, et al.
Published: (2025)

Beyond Generative Priors: Minority Sampling with JEPA-Guided Diffusion
by: Park, Sol, et al.
Published: (2026)

CLARGA: Multimodal Graph Representation Learning over Arbitrary Sets of Modalities
by: Patapati, Santosh
Published: (2025)

Auxiliary Gene Learning: Spatial Gene Expression Estimation by Auxiliary Gene Selection
by: Shiku, Kaito, et al.
Published: (2025)

US-JEPA: A Joint Embedding Predictive Architecture for Medical Ultrasound
by: Radhachandran, Ashwath, et al.
Published: (2026)

LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics
by: Balestriero, Randall, et al.
Published: (2025)

seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World Models
by: Ghaemi, Hafez, et al.
Published: (2025)

Corruption-Aware Training of Latent Video Diffusion Models for Robust Text-to-Video Generation
by: Maduabuchi, Chika, et al.
Published: (2025)

An Empirical Study of World Model Quantization
by: Fu, Zhongqian, et al.
Published: (2026)

VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning
by: Lin, Han, et al.
Published: (2024)

Online Monitoring Framework for Automotive Time Series Data using JEPA Embeddings
by: Fertig, Alexander, et al.
Published: (2026)

UR-JEPA: Uniform Rectifiability as a Regularizer for Joint-Embedding Predictive Architectures
by: Le, Triet M.
Published: (2026)

Contrastive Learning with Auxiliary User Detection for Identifying Activities
by: Ge, Wen, et al.
Published: (2024)

LD-ViCE: Latent Diffusion Model for Video Counterfactual Explanations
by: Varshney, Payal, et al.
Published: (2025)

Latent Uncertainty Representations for Video-based Driver Action and Intention Recognition
by: Vellenga, Koen, et al.
Published: (2025)

HiT-JEPA: A Hierarchical Self-supervised Trajectory Embedding Framework for Similarity Computation
by: Li, Lihuan, et al.
Published: (2025)

Latent Geometry of Taste: Scalable Low-Rank Matrix Factorization for Recommender Systems
by: Salako, Joshua
Published: (2026)

Memorization in 3D Shape Generation: An Empirical Study
by: Pu, Shu, et al.
Published: (2025)

Classes Are Not Equal: An Empirical Study on Image Recognition Fairness
by: Cui, Jiequan, et al.
Published: (2024)

MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
by: Yu, Sihyun, et al.
Published: (2025)

Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition
by: Yu, Sihyun, et al.
Published: (2024)

VideoOrion: Tokenizing Object Dynamics in Videos
by: Feng, Yicheng, et al.
Published: (2024)

Latent Action Pretraining from Videos
by: Ye, Seonghyeon, et al.
Published: (2024)

Diagnostic Benchmarks for Invariant Learning Dynamics: Empirical Validation of the Eidos Architecture
by: Anderson, Datorien L.
Published: (2026)

Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head
by: Yang, Penghui, et al.
Published: (2024)

An Empirical Study Into What Matters for Calibrating Vision-Language Models
by: Tu, Weijie, et al.
Published: (2024)

LLVD: LSTM-based Explicit Motion Modeling in Latent Space for Blind Video Denoising
by: Rashid, Loay, et al.
Published: (2025)

Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics
by: Hoang, Christopher, et al.
Published: (2025)

Enhancing Vector Quantization with Distributional Matching: A Theoretical and Empirical Study
by: Fang, Xianghong, et al.
Published: (2025)

An Empirical Study of Accuracy-Robustness Tradeoff and Training Efficiency in Self-Supervised Learning
by: Ghofrani, Fatemeh, et al.
Published: (2025)

Disentanglement of Biological and Technical Factors via Latent Space Rotation in Clinical Imaging Improves Disease Pattern Discovery
by: Pan, Jeanny, et al.
Published: (2025)

Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation
by: Cao, Yang, et al.
Published: (2025)