Saved in:
Bibliographic Details
Main Author: Orekhov, Boris
Format: Preprint
Published: 2024
Subjects:
Online Access:https://arxiv.org/abs/2407.08099
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866916319899680768
author Orekhov, Boris
author_facet Orekhov, Boris
contents Burrows' Delta was introduced in 2002 and has proven to be an effective tool for author attribution. Despite the fact that these are different languages, they mostly belong to the same grammatical type and use the same graphic principle to convey speech in writing: a phonemic alphabet with word separation using spaces. The question I want to address in this article is how well this attribution method works with texts in a language with a different grammatical structure and a script based on different principles. There are fewer studies analyzing the effectiveness of the Delta method on Chinese texts than on texts in European languages. I believe that such a low level of attention to Delta from sinologists is due to the structure of the scientific field dedicated to medieval Chinese poetry. Clustering based on intertextual distances worked flawlessly. Delta produced results where clustering showed that the samples of one author were most similar to each other, and Delta never confused different poets. Despite the fact that I used an unconventional approach and applied the Delta method to a language poorly suited for it, the method demonstrated its effectiveness. Tang dynasty poets are correctly identified using Delta, and the empirical pattern observed for authors writing in European standard languages has been confirmed once again.
format Preprint
id arxiv_https___arxiv_org_abs_2407_08099
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle How does Burrows' Delta work on medieval Chinese poetic texts?
Orekhov, Boris
Computation and Language
Burrows' Delta was introduced in 2002 and has proven to be an effective tool for author attribution. Despite the fact that these are different languages, they mostly belong to the same grammatical type and use the same graphic principle to convey speech in writing: a phonemic alphabet with word separation using spaces. The question I want to address in this article is how well this attribution method works with texts in a language with a different grammatical structure and a script based on different principles. There are fewer studies analyzing the effectiveness of the Delta method on Chinese texts than on texts in European languages. I believe that such a low level of attention to Delta from sinologists is due to the structure of the scientific field dedicated to medieval Chinese poetry. Clustering based on intertextual distances worked flawlessly. Delta produced results where clustering showed that the samples of one author were most similar to each other, and Delta never confused different poets. Despite the fact that I used an unconventional approach and applied the Delta method to a language poorly suited for it, the method demonstrated its effectiveness. Tang dynasty poets are correctly identified using Delta, and the empirical pattern observed for authors writing in European standard languages has been confirmed once again.
title How does Burrows' Delta work on medieval Chinese poetic texts?
topic Computation and Language
url https://arxiv.org/abs/2407.08099