:: Library Catalog

Bibliographic Details
Main Authors:	Gkikas, Stefanos; Tsiknakis, Manolis
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2407.19811

Similar Items

Twins-PainViT: Towards a Modality-Agnostic Vision Transformer Framework for Multimodal Automatic Pain Assessment using Facial Videos and fNIRS
by: Gkikas, Stefanos, et al.
Published: (2024)

A Full Transformer-based Framework for Automatic Pain Estimation using Videos
by: Gkikas, Stefanos, et al.
Published: (2024)

PainFormer: a Vision Foundation Model for Automatic Pain Assessment
by: Gkikas, Stefanos, et al.
Published: (2025)

Multi-task Neural Networks for Pain Intensity Estimation using Electrocardiogram and Demographic Factors
by: Gkikas, Stefanos, et al.
Published: (2024)

Multi-Representation Diagrams for Pain Recognition: Integrating Various Electrodermal Activity Signals into a Single Image
by: Gkikas, Stefanos, et al.
Published: (2025)

Efficient Pain Recognition via Respiration Signals: A Single Cross-Attention Transformer Multi-Window Fusion Pipeline
by: Gkikas, Stefanos, et al.
Published: (2025)

Tiny-BioMoE: a Lightweight Embedding Model for Biosignal Analysis
by: Gkikas, Stefanos, et al.
Published: (2025)

A Lightweight Transformer for Pain Recognition from Brain Activity
by: Gkikas, Stefanos, et al.
Published: (2026)

A Pain Assessment Framework based on multimodal data and Deep Machine Learning methods
by: Gkikas, Stefanos
Published: (2025)

GraphMLP: A Graph MLP-Like Architecture for 3D Human Pose Estimation
by: Li, Wenhao, et al.
Published: (2022)

MM-UNet: A Mixed MLP Architecture for Improved Ophthalmic Image Segmentation
by: Xiao, Zunjie, et al.
Published: (2024)

Bridging KAN and MLP: MJKAN, a Hybrid Architecture with Both Efficiency and Expressiveness
by: Joo, Hanseon, et al.
Published: (2025)

SD-VSum: A Method and Dataset for Script-Driven Video Summarization
by: Mylonas, Manolis, et al.
Published: (2025)

Learning to Find Missing Video Frames with Synthetic Data Augmentation: A General Framework and Application in Generating Thermal Images Using RGB Cameras
by: Andersen, Mathias Viborg, et al.
Published: (2024)

MRI-Based Brain Tumor Detection through an Explainable EfficientNetV2 and MLP-Mixer-Attention Architecture
by: Yurdakul, Mustafa, et al.
Published: (2025)

Segment Any RGB-Thermal Model with Language-aided Distillation
by: Xing, Dong, et al.
Published: (2025)

A Deep Learning approach for Depressive Symptoms assessment in Parkinson's disease patients using facial videos
by: Kyprakis, Ioannis, et al.
Published: (2025)

Learning to Predict Aboveground Biomass from RGB Images with 3D Synthetic Scenes
by: Zuffi, Silvia
Published: (2025)

RGB-Th-Bench: A Dense benchmark for Visual-Thermal Understanding of Vision Language Models
by: Moshtaghi, Mehdi, et al.
Published: (2025)

Complementary Random Masking for RGB-Thermal Semantic Segmentation
by: Shin, Ukcheol, et al.
Published: (2023)

SpiralMLP: A Lightweight Vision MLP Architecture
by: Mu, Haojie, et al.
Published: (2024)

Heterogeneous Graph Transformer for Multiple Tiny Object Tracking in RGB-T Videos
by: Xu, Qingyu, et al.
Published: (2024)

Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition
by: Wang, Xiao, et al.
Published: (2023)

Video Text Preservation with Synthetic Text-Rich Videos
by: Liu, Ziyang, et al.
Published: (2025)

Randomized-MLP Regularization Improves Domain Adaptation and Interpretability in DINOv2
by: Ortega, Joel Valdivia, et al.
Published: (2025)

Vision-Language Models for Ergonomic Assessment of Manual Lifting Tasks: Estimating Horizontal and Vertical Hand Distances from RGB Video
by: Rajabi, Mohammad Sadra, et al.
Published: (2026)

evMLP: An Efficient Event-Driven MLP Architecture for Vision
by: Zheng, Zhentan
Published: (2025)

Pix2Next: Leveraging Vision Foundation Models for RGB to NIR Image Translation
by: Jin, Youngwan, et al.
Published: (2024)

SPACT18: Spiking Human Action Recognition Benchmark Dataset with Complementary RGB and Thermal Modalities
by: Ashraf, Yasser, et al.
Published: (2025)

Unsupervised Training of Vision Transformers with Synthetic Negatives
by: Giakoumoglou, Nikolaos, et al.
Published: (2025)

PreSem-Surf: RGB-D Surface Reconstruction with Progressive Semantic Modeling and SG-MLP Pre-Rendering Mechanism
by: Ye, Yuyan, et al.
Published: (2025)

Depth Any Video with Scalable Synthetic Data
by: Yang, Honghui, et al.
Published: (2024)

TerraQ: Spatiotemporal Question-Answering on Satellite Image Archives
by: Kefalidis, Sergios-Anestis, et al.
Published: (2025)

Fake & Square: Training Self-Supervised Vision Transformers with Synthetic Data and Synthetic Hard Negatives
by: Giakoumoglou, Nikolaos, et al.
Published: (2025)

MLP: Motion Label Prior for Temporal Sentence Localization in Untrimmed 3D Human Motions
by: Yan, Sheng, et al.
Published: (2024)

A Survey on Mamba Architecture for Vision Applications
by: Ibrahim, Fady, et al.
Published: (2025)

Enhancing the Safety of Medical Vision-Language Models by Synthetic Demonstrations
by: Xue, Zhiyu, et al.
Published: (2025)

VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation
by: He, Xuan, et al.
Published: (2024)

D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection
by: Do, Dinh Phat, et al.
Published: (2024)

Synthetic Human Action Video Data Generation with Pose Transfer
by: Knapp, Vaclav, et al.
Published: (2025)