Bibliographic Details
Main Author: Zilberman, Mark
Format: Preprint
Published: 2025
Subjects: Computers and Society; Artificial Intelligence
Online Access: https://arxiv.org/abs/2507.03059
_version_ 1866911038350295040
author Zilberman, Mark
author_facet Zilberman, Mark
contents This article explores the feasibility of creating an "electronic copy" of a deceased researcher by training artificial intelligence (AI) on the data stored in their personal computers. By analyzing typical data volumes on inherited researcher computers, including textual files such as articles, emails, and drafts, it is estimated that approximately one million words are available for AI training. This volume is sufficient for fine-tuning advanced pre-trained models like GPT-4 to replicate a researcher's writing style, domain expertise, and rhetorical voice with high fidelity. The study also discusses the potential enhancements from including non-textual data and file metadata to enrich the AI's representation of the researcher. Extensions of the concept include communication between living researchers and their electronic copies, collaboration among individual electronic copies, as well as the creation and interconnection of organizational electronic copies to optimize information access and strategic decision-making. Ethical considerations such as ownership and security of these electronic copies are highlighted as critical for responsible implementation. The findings suggest promising opportunities for AI-driven preservation and augmentation of intellectual legacy.
format Preprint
id arxiv_https___arxiv_org_abs_2507_03059
institution arXiv
publishDate 2025
record_format arxiv
title AI-Based Reconstruction from Inherited Personal Data: Analysis, Feasibility, and Prospects
topic Computers and Society
Artificial Intelligence
url https://arxiv.org/abs/2507.03059