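Staff View: Data-Driven Information Extraction and Enrichment of Molecular Profiling Data for Cancer Cell Lines :: Library Catalog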

Bibliographic Details
Main Authors:	Smith, Ellery; Paloots, Rahel; Giagkos, Dimitris; Baudis, Michael; Stockinger, Kurt
Format:	Preprint
Published:	2023
Subjects:	Computation and Language; Computational Engineering, Finance, and Science; Databases
Online Access:	https://arxiv.org/abs/2307.00933

_version_	1866911774489444352
author	Smith, Ellery; Paloots, Rahel; Giagkos, Dimitris; Baudis, Michael; Stockinger, Kurt
author_facet	Smith, Ellery; Paloots, Rahel; Giagkos, Dimitris; Baudis, Michael; Stockinger, Kurt
contents	With the proliferation of research means and computational methodologies, published biomedical literature is growing exponentially in numbers and volume. Cancer cell lines are frequently used models in biological and medical research that are currently applied for a wide range of purposes, from studies of cellular mechanisms to drug development, which has led to a wealth of related data and publications. Sifting through large quantities of text to gather relevant information on the cell lines of interest is tedious and extremely slow when performed by humans. Hence, novel computational information extraction and correlation mechanisms are required to boost meaningful knowledge extraction. In this work, we present the design, implementation and application of a novel data extraction and exploration system. This system extracts deep semantic relations between textual entities from scientific literature to enrich existing structured clinical data in the domain of cancer cell lines. We introduce a new public data exploration portal, which enables automatic linking of genomic copy number variant plots with ranked, related entities such as affected genes. Each relation is accompanied by literature-derived evidence, allowing for deep, yet rapid, literature search, using existing structured data as a springboard. Our system is publicly available on the web at https://cancercelllines.org.
format	Preprint
id	arxiv_https___arxiv_org_abs_2307_00933
institution	arXiv
publishDate	2023
record_format	arxiv
title	Data-Driven Information Extraction and Enrichment of Molecular Profiling Data for Cancer Cell Lines
topic	Computation and Language; Computational Engineering, Finance, and Science; Databases
url	https://arxiv.org/abs/2307.00933
