Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Le, Thanh Binh; Vo, Hoang Nhat Khang; Mai, Tan-Ha; Phan, Trong Nhan
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2509.20813

_version_	1866908558355857408
author	Le, Thanh Binh; Vo, Hoang Nhat Khang; Mai, Tan-Ha; Phan, Trong Nhan
author_facet	Le, Thanh Binh; Vo, Hoang Nhat Khang; Mai, Tan-Ha; Phan, Trong Nhan
contents	Low back pain affects millions worldwide, driving the need for robust diagnostic models that can jointly analyze complex medical images and accompanying text reports. We present LumbarCLIP, a novel multimodal framework that leverages contrastive language-image pretraining to align lumbar spine MRI scans with corresponding radiological descriptions. Built upon a curated dataset containing axial MRI views paired with expert-written reports, LumbarCLIP integrates vision encoders (ResNet-50, Vision Transformer, Swin Transformer) with a BERT-based text encoder to extract dense representations. These are projected into a shared embedding space via learnable projection heads, configurable as linear or non-linear, and normalized to facilitate stable contrastive training using a soft CLIP loss. Our model achieves state-of-the-art performance on downstream classification, reaching up to 95.00% accuracy and 94.75% F1-score on the test set, despite inherent class imbalance. Extensive ablation studies demonstrate that linear projection heads yield more effective cross-modal alignment than non-linear variants. LumbarCLIP offers a promising foundation for automated musculoskeletal diagnosis and clinical decision support.
format	Preprint
id	arxiv_https___arxiv_org_abs_2509_20813
institution	arXiv
publishDate	2025
record_format	arxiv
title	Revolutionizing Precise Low Back Pain Diagnosis via Contrastive Learning
topic	Computer Vision and Pattern Recognition; Artificial Intelligence
url	https://arxiv.org/abs/2509.20813
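
The abstract in the contents field describes projecting image and text features through learnable projection heads, normalizing them, and training with a soft CLIP loss. The record does not define that loss, so the following is only a minimal sketch under stated assumptions: the projection heads are plain linear maps, the soft targets are a softmax over averaged within-modality similarities (one common formulation, not confirmed as the authors' choice), and every function name and the temperature value are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products act as cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def linear_project(v, weight):
    """Linear projection head; weight is a list of output rows (out_dim x in_dim)."""
    return [sum(w * x for w, x in zip(row, v)) for row in weight]


def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def similarity(a, b, temperature):
    """Pairwise dot products between two lists of vectors, divided by temperature."""
    return [[sum(x * y for x, y in zip(u, v)) / temperature for v in b] for u in a]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def cross_entropy(logits, targets):
    """Mean cross-entropy between rows of logits and rows of soft targets."""
    total = 0.0
    for logit_row, target_row in zip(logits, targets):
        m = max(logit_row)
        log_z = m + math.log(sum(math.exp(x - m) for x in logit_row))
        total -= sum(t * (x - log_z) for t, x in zip(target_row, logit_row))
    return total / len(logits)


def soft_clip_loss(image_feats, text_feats, w_img, w_txt, temperature=0.1):
    """Symmetric contrastive loss with soft targets (assumed formulation)."""
    # Project each modality into the shared space, then normalize.
    img = [l2_normalize(linear_project(v, w_img)) for v in image_feats]
    txt = [l2_normalize(linear_project(v, w_txt)) for v in text_feats]

    # Image-to-text logits; the transpose gives text-to-image logits.
    logits = similarity(img, txt, temperature)

    # Assumption: soft targets come from the average of image-image and
    # text-text similarities, rather than a hard identity matrix.
    img_sim = similarity(img, img, temperature)
    txt_sim = similarity(txt, txt, temperature)
    targets = [
        softmax([(a + b) / 2.0 for a, b in zip(row_i, row_t)])
        for row_i, row_t in zip(img_sim, txt_sim)
    ]

    loss_i2t = cross_entropy(logits, targets)
    loss_t2i = cross_entropy(transpose(logits), transpose(targets))
    return (loss_i2t + loss_t2i) / 2.0
```

With toy inputs such as two orthogonal feature vectors per modality and identity projection weights, `soft_clip_loss` should return a smaller value when image and text rows are paired correctly than when the text rows are swapped. A non-linear head, the other configuration the abstract mentions, would replace `linear_project` with a small multi-layer map.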