Saved in:
| Main Authors: | , , , |
|---|---|
| Format: | Preprint |
| Published: |
2024
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.01408 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866914962641780736 |
|---|---|
| author | Wang, Jun Mao, Yu Guan, Nan Xue, Chun Jason |
| author_facet | Wang, Jun Mao, Yu Guan, Nan Xue, Chun Jason |
| contents | The multimodal model has demonstrated promise in histopathology. However, most multimodal models are based on H\&E and genomics, adopting increasingly complex yet black-box designs. In our paper, we propose a novel interpretable multimodal framework named SHAP-CAT, which uses a Shapley-value-based dimension reduction technique for effective multimodal fusion. Starting with two paired modalities -- H\&E and IHC images, we employ virtual staining techniques to enhance limited input data by generating a new clinical-related modality. Lightweight bag-level representations are extracted from image modalities and a Shapley-value-based mechanism is used for dimension reduction. For each dimension of the bag-level representation, attribution values are calculated to indicate how changes in the specific dimensions of the input affect the model output. In this way, we select a few top important dimensions of bag-level representation for each image modality to late fusion. Our experimental results demonstrate that the proposed SHAP-CAT framework incorporating synthetic modalities significantly enhances model performance, yielding a 5\% increase in accuracy for the BCI, an 8\% increase for IHC4BC-ER, and an 11\% increase for the IHC4BC-PR dataset. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2410_01408 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| spellingShingle | SHAP-CAT: A interpretable multi-modal framework enhancing WSI classification via virtual staining and shapley-value-based multimodal fusion Wang, Jun Mao, Yu Guan, Nan Xue, Chun Jason Computer Vision and Pattern Recognition The multimodal model has demonstrated promise in histopathology. However, most multimodal models are based on H\&E and genomics, adopting increasingly complex yet black-box designs. In our paper, we propose a novel interpretable multimodal framework named SHAP-CAT, which uses a Shapley-value-based dimension reduction technique for effective multimodal fusion. Starting with two paired modalities -- H\&E and IHC images, we employ virtual staining techniques to enhance limited input data by generating a new clinical-related modality. Lightweight bag-level representations are extracted from image modalities and a Shapley-value-based mechanism is used for dimension reduction. For each dimension of the bag-level representation, attribution values are calculated to indicate how changes in the specific dimensions of the input affect the model output. In this way, we select a few top important dimensions of bag-level representation for each image modality to late fusion. Our experimental results demonstrate that the proposed SHAP-CAT framework incorporating synthetic modalities significantly enhances model performance, yielding a 5\% increase in accuracy for the BCI, an 8\% increase for IHC4BC-ER, and an 11\% increase for the IHC4BC-PR dataset. |
| title | SHAP-CAT: A interpretable multi-modal framework enhancing WSI classification via virtual staining and shapley-value-based multimodal fusion |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2410.01408 |