Bibliographic Details
Main Authors: Rachel, Ann; Pawar, Pranav M; Mukharjee, Mithun; M, Raja; Mathew, Tojo
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2603.16330
_version_ 1866918393538412544
author Rachel, Ann
Pawar, Pranav M
Mukharjee, Mithun
M, Raja
Mathew, Tojo
contents Lung cancer is the abnormal, uncontrolled growth of malignant cells in the lungs. Common treatment strategies such as surgery, chemotherapy, and radiation are not always the best options because of the heterogeneous nature of cancer. In personalized medicine, treatments are tailored to the individual's genetic information along with lifestyle aspects. In addition, AI-based deep learning methods can analyze large data sets to find early signs of cancer, tumor types, and treatment prospects. The paper focuses on developing personalized treatment plans from patient-specific data, primarily the genetic profile. Multi-omics data from Genomics of Drug Sensitivity in Cancer (GDSC) are used with machine learning techniques to build a predictive model. The value of the target variable, LN-IC50, indicates sensitivity or resistance to a drug. An XGBoost regressor is used to predict drug response from molecular and cellular features extracted from cancer datasets. Cross-validation and randomized search are performed for hyperparameter tuning to optimize the model's predictive performance. For explanation, SHAP (SHapley Additive exPlanations) is used; SHAP values measure each feature's impact on an individual prediction. Furthermore, feature relationships are interpreted using DeepSeek, a large language model, to check the biological validity of the features. DeepSeek provides contextual explanations of the most important genes or pathways among the top SHAP-ranked features, lending support to the model's predictions.
format Preprint
id arxiv_https___arxiv_org_abs_2603_16330
institution arXiv
publishDate 2026
record_format arxiv
title An Interpretable Machine Learning Framework for Non-Small Cell Lung Cancer Drug Response Analysis
topic Computer Vision and Pattern Recognition
Artificial Intelligence
Machine Learning
url https://arxiv.org/abs/2603.16330
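
The abstract states that SHAP values measure each feature's impact on an individual prediction. The idea can be illustrated with a minimal sketch that computes exact Shapley values by brute force for a tiny model. This is not the paper's code and not the SHAP library's tree-based algorithm: the function `toy_ln_ic50`, its three feature names, and the all-zero baseline are hypothetical stand-ins chosen only for illustration, and a feature "missing" from a coalition is assumed to take its baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction (brute force, small n only).

    A feature outside the coalition is replaced by its baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_ln_ic50(features):
    """Hypothetical stand-in for a trained regressor (assumption, not from the paper)."""
    expr_a, expr_b, mutated = features
    return 2.0 - 0.5 * expr_a + 0.25 * expr_b - 1.0 * mutated * expr_a


x = [2.0, 4.0, 1.0]          # one hypothetical sample
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_ln_ic50, x, baseline)

for name, value in zip(["expr_a", "expr_b", "mutated"], phi):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that makes SHAP values readable per prediction. The brute-force loop grows exponentially with the number of features, so practical tools rely on model-specific algorithms instead.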