Bibliographic Details
Main authors: Chen, Dong, Wei, Zizhuang, Xu, Jialei, Sun, Xinyang, He, Zonglin, An, Meiru, Peng, Huili, Hu, Yong, Cheung, Kenneth MC
Format: Preprint
Published: 2026
Subjects:
Online access: https://arxiv.org/abs/2602.06743
_version_ 1866914311100694528
author Chen, Dong
Wei, Zizhuang
Xu, Jialei
Sun, Xinyang
He, Zonglin
An, Meiru
Peng, Huili
Hu, Yong
Cheung, Kenneth MC
author_facet Chen, Dong
Wei, Zizhuang
Xu, Jialei
Sun, Xinyang
He, Zonglin
An, Meiru
Peng, Huili
Hu, Yong
Cheung, Kenneth MC
contents Adolescent Idiopathic Scoliosis (AIS) is a prevalent spinal deformity whose progression can be mitigated through early detection. Conventional screening methods are often subjective, difficult to scale, and reliant on specialized clinical expertise. Video-based gait analysis offers a promising alternative, but current datasets and methods frequently suffer from data leakage, where performance is inflated by repeated clips from the same individual, or employ oversimplified models that lack clinical interpretability. To address these limitations, we introduce ScoliGait, a new benchmark dataset comprising 1,572 gait video clips for training and 300 fully independent clips for testing. Each clip is annotated with radiographic Cobb angles and descriptive text based on clinical kinematic priors. We propose a multi-modal framework that integrates a clinical-prior-guided kinematic knowledge map for interpretable feature representation, alongside a latent attention pooling mechanism to fuse video, text, and knowledge map modalities. Our approach establishes a new state of the art, showing a significant performance gain on a realistic, subject-independent benchmark. This work provides a robust, interpretable, and clinically grounded foundation for scalable, non-invasive AIS assessment.
format Preprint
id arxiv_https___arxiv_org_abs_2602_06743
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle Clinical-Prior Guided Multi-Modal Learning with Latent Attention Pooling for Gait-Based Scoliosis Screening
Chen, Dong
Wei, Zizhuang
Xu, Jialei
Sun, Xinyang
He, Zonglin
An, Meiru
Peng, Huili
Hu, Yong
Cheung, Kenneth MC
Computer Vision and Pattern Recognition
Adolescent Idiopathic Scoliosis (AIS) is a prevalent spinal deformity whose progression can be mitigated through early detection. Conventional screening methods are often subjective, difficult to scale, and reliant on specialized clinical expertise. Video-based gait analysis offers a promising alternative, but current datasets and methods frequently suffer from data leakage, where performance is inflated by repeated clips from the same individual, or employ oversimplified models that lack clinical interpretability. To address these limitations, we introduce ScoliGait, a new benchmark dataset comprising 1,572 gait video clips for training and 300 fully independent clips for testing. Each clip is annotated with radiographic Cobb angles and descriptive text based on clinical kinematic priors. We propose a multi-modal framework that integrates a clinical-prior-guided kinematic knowledge map for interpretable feature representation, alongside a latent attention pooling mechanism to fuse video, text, and knowledge map modalities. Our approach establishes a new state of the art, showing a significant performance gain on a realistic, subject-independent benchmark. This work provides a robust, interpretable, and clinically grounded foundation for scalable, non-invasive AIS assessment.
title Clinical-Prior Guided Multi-Modal Learning with Latent Attention Pooling for Gait-Based Scoliosis Screening
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2602.06743
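
The abstract in this record attributes inflated results to data leakage, where clips from the same individual appear in both training and test data. The sketch below shows one generic way to build a subject-independent split. It is an illustration only: the clip and subject identifiers, the `test_fraction` parameter, and the function name `split_by_subject` are hypothetical, and the record does not describe how the ScoliGait split was actually produced.

```python
import random
from collections import defaultdict


def split_by_subject(clips, test_fraction=0.2, seed=0):
    """Split (clip_id, subject_id) pairs so that no subject is in both sets.

    All names and the default fraction are illustrative assumptions,
    not the procedure used by the authors.
    """
    # Group clip ids under the subject they belong to.
    by_subject = defaultdict(list)
    for clip_id, subject_id in clips:
        by_subject[subject_id].append(clip_id)

    # Shuffle subjects (not clips) deterministically, then hold some out.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])

    train = [(c, s) for c, s in clips if s not in test_subjects]
    test = [(c, s) for c, s in clips if s in test_subjects]
    return train, test


if __name__ == "__main__":
    # Toy data: 10 hypothetical subjects with 4 clips each.
    toy = [(f"clip{i}", f"subject{i % 10}") for i in range(40)]
    train, test = split_by_subject(toy)
    overlap = {s for _, s in train} & {s for _, s in test}
    print(len(train), len(test), overlap)
```

Splitting at the subject level rather than the clip level means every clip of a held-out individual lands in the test set, so a model cannot score well merely by recognizing a person it has already seen.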