Bibliographic Details
Main Authors: Shanmugarasa, Yashothara; Pan, Shidong; Ding, Ming; Zhao, Dehai; Rakotoarivelo, Thierry
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2504.09961
Abstract:
  • As Large Language Models (LLMs) become integral to scientific workflows, concerns have emerged over the confidentiality and ethical handling of sensitive data. This paper explores, from scientists' perspectives, the data exposure risks posed by LLM-powered scientific tools, which can inadvertently leak confidential information, including intellectual property and proprietary data. We propose "DataShield", a framework designed to detect confidential data leaks, summarize privacy policies, and visualize data flow, helping to ensure alignment with organizational policies and procedures. Our approach aims to inform scientists about data handling practices, enabling them to make informed decisions and protect sensitive information. User studies with scientists are underway to evaluate the framework's usability, trustworthiness, and effectiveness in tackling real-world privacy challenges.