| Main Authors: | Winterbottom, Thomas, Hudson, G. Thomas, Kluvanec, Daniel, Slack, Dean, Sterling, Jamie, Shentu, Junjie, Xiao, Chenghao, Zhou, Zheming, Moubayed, Noura Al |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.17450 |
Similar Items
Tricks and Plug-ins for Gradient Boosting in Image Classification
by: Fang, Biyi, et al.
Published: (2025)
FT-NCFM: An Influence-Aware Data Distillation Framework for Efficient VLA Models
by: Chen, Kewei, et al.
Published: (2025)
ATAAT: Adaptive Threat-Aware Adversarial Tuning Framework against Backdoor Attacks on Vision-Language-Action Models
by: Chen, Kewei, et al.
Published: (2026)
VA-$π$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation
by: Liao, Xinyao, et al.
Published: (2025)
Unpacking Hateful Memes: Presupposed Context and False Claims
by: Cai, Weibin, et al.
Published: (2025)
CulinaryCut-VLAP: A Vision-Language-Action-Physics Framework for Food Cutting via a Force-Aware Material Point Method
by: Koh, Hyunseo, et al.
Published: (2026)
Perception-Consistency Multimodal Large Language Models Reasoning via Caption-Regularized Policy Optimization
by: Tu, Songjun, et al.
Published: (2025)
Training a Student Expert via Semi-Supervised Foundation Model Distillation
by: Taghavi, Pardis, et al.
Published: (2026)
SemanticFeels: Semantic Labeling during In-Hand Manipulation
by: Khalil, Anas Al Shikh, et al.
Published: (2026)
Physics-informed Variational Autoencoders for Improved Robustness to Environmental Factors of Variation
by: Thoreau, Romain, et al.
Published: (2022)
Predictive Modeling of Maritime Radar Data Using Transformer Architecture
by: Qesaraku, Bjorna, et al.
Published: (2025)
Do Generative Metrics Predict YOLO Performance? An Evaluation Across Models, Augmentation Ratios, and Dataset Complexity
by: Marian, Vasile, et al.
Published: (2026)
Short-Window Sliding Learning for Real-Time Violence Detection via LLM-based Auto-Labeling
by: Jung, Seoik, et al.
Published: (2025)
Training for X-Ray Vision: Amodal Segmentation, Amodal Content Completion, and View-Invariant Object Representation from Multi-Camera Video
by: Moore, Alexander, et al.
Published: (2025)
Cooperative Perception: A Resource-Efficient Framework for Multi-Drone 3D Scene Reconstruction Using Federated Diffusion and NeRF
by: Pourmandi, Massoud
Published: (2025)
Visible and Hyperspectral Imaging for Quality Assessment of Milk: Property Characterisation and Identification
by: Martinelli, Massimo, et al.
Published: (2026)
Balanced conic rectified flow
by: Kim, Shin Seong, et al.
Published: (2025)
AUTHENTICATION: Identifying Rare Failure Modes in Autonomous Vehicle Perception Systems using Adversarially Guided Diffusion Models
by: Zarei, Mohammad, et al.
Published: (2025)
Method of UAV Inspection of Photovoltaic Modules Using Thermal and RGB Data Fusion
by: Lysyi, Andrii, et al.
Published: (2025)
Spectral Integrated Gradients for Coarse-to-Fine Feature Attribution
by: Kim, Soyeon, et al.
Published: (2026)
Manifold-Aligned Guided Integrated Gradients for Reliable Feature Attribution
by: Kim, Soyeon, et al.
Published: (2026)
Supervised Learning Has a Necessary Geometric Blind Spot: Theory, Consequences, and Minimal Repair
by: Rajput, Vishal
Published: (2026)
Rethinking Visual Intelligence: Insights from Video Pretraining
by: Acuaviva, Pablo, et al.
Published: (2025)
Multimodal Generative AI for Story Point Estimation in Software Development
by: Islam, Mohammad Rubyet, et al.
Published: (2025)
Contrastive Consolidation of Top-Down Modulations Achieves Sparsely Supervised Continual Learning
by: Tran, Viet Anh Khoa, et al.
Published: (2025)
IDOL: Instant Photorealistic 3D Human Creation from a Single Image
by: Zhuang, Yiyu, et al.
Published: (2024)
Sat-JEPA-Diff: Bridging Self-Supervised Learning and Generative Diffusion for Remote Sensing
by: Komurcu, Kursat, et al.
Published: (2026)
A Survey on Vision-Language-Action Models for Embodied AI
by: Ma, Yueen, et al.
Published: (2024)
ForAug: Recombining Foregrounds and Backgrounds to Improve Vision Transformer Training with Bias Mitigation
by: Nauen, Tobias Christian, et al.
Published: (2025)
Akasha 2: Hamiltonian State Space Duality and Visual-Language Joint Embedding Predictive Architecture
by: Meziani, Yani
Published: (2026)
Clarification as Supervision: Reinforcement Learning for Vision-Language Interfaces
by: Gkountouras, John, et al.
Published: (2025)
Evaluating Visual Mathematics in Multimodal LLMs: A Multilingual Benchmark Based on the Kangaroo Tests
by: Sáez, Arnau Igualde, et al.
Published: (2025)
Robust Noise Attenuation via Adaptive Pooling of Transformer Outputs
by: Brothers, Greyson
Published: (2025)
Complex Facial Expression Recognition Using Deep Knowledge Distillation of Basic Features
by: Maiden, Angus, et al.
Published: (2023)
ExpReS-VLA: Specializing Vision-Language-Action Models Through Experience Replay and Retrieval
by: Syed, Shahram Najam, et al.
Published: (2025)
TextTeacher: What Can Language Teach About Images?
by: Nauen, Tobias Christian, et al.
Published: (2026)
TACIT: Transformation-Aware Capturing of Implicit Thought
by: Nobrega, Daniel
Published: (2026)
Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models
by: Popov, Alexander, et al.
Published: (2024)
Salient Concept-Aware Generative Data Augmentation
by: Zhao, Tianchen, et al.
Published: (2025)
Hateful Meme Detection through Context-Sensitive Prompting and Fine-Grained Labeling
by: Ouyang, Rongxin, et al.
Published: (2024)