| Main Authors: | Alishzade, Nigar, Abdullayeva, Gulchin |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.12096 |
Similar Items
A Comparative Analysis of Recurrent and Attention Architectures for Isolated Sign Language Recognition
by: Alishzade, Nigar, et al.
Published: (2025)
AzSLD: Azerbaijani Sign Language Dataset for Fingerspelling, Word, and Sentence Translation with Baseline Software
by: Alishzade, Nigar, et al.
Published: (2024)
GLoT: A Novel Gated-Logarithmic Transformer for Efficient Sign Language Translation
by: Shahin, Nada, et al.
Published: (2025)
CLIP-Joint-Detect: End-to-End Joint Training of Object Detectors with Contrastive Vision-Language Supervision
by: Raoufi, Behnam, et al.
Published: (2025)
ADAT: Time-Series-Aware Adaptive Transformer Architecture for Sign Language Translation
by: Shahin, Nada, et al.
Published: (2025)
HATL: Hierarchical Adaptive-Transfer Learning Framework for Sign Language Machine Translation
by: Shahin, Nada, et al.
Published: (2026)
Image-Based Leopard Seal Recognition: Approaches and Challenges in Current Automated Systems
by: Salazar, Jorge Yero, et al.
Published: (2024)
Task-Adaptive Semantic Communications with Controllable Diffusion-based Data Regeneration
by: Guo, Fupei, et al.
Published: (2025)
ViQAgent: Zero-Shot Video Question Answering via Agent with Open-Vocabulary Grounding Validation
by: Montes, Tony, et al.
Published: (2025)
OCC-MLLM-CoT-Alpha: Towards Multi-stage Occlusion Recognition Based on Large Language Models via 3D-Aware Supervision and Chain-of-Thoughts Guidance
by: Wang, Chaoyi, et al.
Published: (2025)
A Multi-purpose Tracking Framework for Salmon Welfare Monitoring in Challenging Environments
by: Høgstedt, Espen Uri, et al.
Published: (2025)
ALADIN: Attribute-Language Distillation Network for Person Re-Identification
by: Zhou, Wang, et al.
Published: (2026)
Mechanisms of Prompt-Induced Hallucination in Vision-Language Models
by: Rudman, William, et al.
Published: (2026)
LISA: Language-guided Interference-aware Spatial-Frequency Attention for Driver Gaze Estimation
by: Ma, Jun, et al.
Published: (2026)
MoDE: Mixture of Diffusion Experts for Any Occluded Face Recognition
by: Fan, Qiannan, et al.
Published: (2025)
A Vision-Language Model for Focal Liver Lesion Classification
by: Jian, Song, et al.
Published: (2025)
Sign language recognition based on deep learning and low-cost handcrafted descriptors
by: Carneiro, Alvaro Leandro Cavalcante, et al.
Published: (2024)
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models
by: Hacheme, Gilles Quentin, et al.
Published: (2025)
Prompt Sensitivity in Vision-Language Grounding: How Small Changes in Wording Affect Object Detection
by: Deka, Dawar Jyoti, et al.
Published: (2026)
Motion-Guided Semantic Alignment with Negative Prompts for Zero-Shot Video Action Recognition
by: Wang, Yiming, et al.
Published: (2026)
NOAH: Benchmarking Narrative Prior driven Hallucination and Omission in Video Large Language Models
by: Lee, Kyuho, et al.
Published: (2025)
NeuroGaze-Distill: Brain-informed Distillation and Depression-Inspired Geometric Priors for Robust Facial Emotion Recognition
by: Li, Zilin, et al.
Published: (2025)
Domain-Adaptive Pretraining Improves Primate Behavior Recognition
by: Mueller, Felix B., et al.
Published: (2025)
Argos: A Decentralized Federated System for Detection of Traffic Signs in CAVs
by: Hossein, Seyed Mahdi Haji Seyed, et al.
Published: (2025)
SVGS-DSGAT: An IoT-Enabled Innovation in Underwater Robotic Object Detection Technology
by: Wu, Dongli, et al.
Published: (2025)
Enhancing Spatial Reasoning in Vision-Language Models via Chain-of-Thought Prompting and Reinforcement Learning
by: Ji, Binbin, et al.
Published: (2025)
Pedestrian Detection in Low-Light Conditions: A Comprehensive Survey
by: Ghari, Bahareh, et al.
Published: (2024)
From Dead Pixels to Editable Slides: Infographic Reconstruction into Native Google Slides via Vision-Language Region Understanding
by: Gonzalez, Leonardo
Published: (2026)
Two-step Authentication: Multi-biometric System Using Voice and Facial Recognition
by: Chen, Kuan Wei, et al.
Published: (2026)
When to Call an Apple Red: Humans Follow Introspective Rules, VLMs Don't
by: Nemitz, Jonathan, et al.
Published: (2026)
Transfer-learning for video classification: Video Swin Transformer on multiple domains
by: Oliveira, Daniel A. P., et al.
Published: (2022)
Adaptive Thresholding for Visual Place Recognition using Negative Gaussian Mixture Statistics
by: Trinh, Nick, et al.
Published: (2025)
Attention-Aware Transformer-Based Aggregation Network for Video Periocular Recognition
by: Carreira, Luiz G F, et al.
Published: (2026)
RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models
by: Ge, Junyao, et al.
Published: (2024)
Low-Cost Tree Crown Dieback Estimation Using Deep Learning-Based Segmentation
by: Allen, M. J., et al.
Published: (2024)
Physical Knot Classification Beyond Accuracy: A Benchmark and Diagnostic Study
by: Nie, Shiheng, et al.
Published: (2026)
Human-Centric Perception for Child Sexual Abuse Imagery
by: Laranjeira, Camila, et al.
Published: (2026)
Decoupled Sensitivity-Consistency Learning for Weakly Supervised Video Anomaly Detection
by: Zheng, Hantao, et al.
Published: (2026)
Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic Approach
by: Ke, Yueying
Published: (2025)
Gaussian Alignment for Relative Camera Pose Estimation via Single-View Reconstruction
by: Li, Yumin, et al.
Published: (2025)