Bibliographic Details
Main Author: Chen, Yihong
Format: Preprint
Published: 2025
Subjects: Computation and Language; Artificial Intelligence
Online Access: https://arxiv.org/abs/2509.00949
_version_ 1866914015286919168
author Chen, Yihong
author_facet Chen, Yihong
contents The making of knowledge engines in natural language processing has been shaped by two seemingly distinct paradigms: one grounded in structure, the other driven by massively available unstructured data. The structured paradigm leverages predefined symbolic interactions, such as knowledge graphs, as priors and designs models to capture them. In contrast, the unstructured paradigm centers on scaling transformer architectures with increasingly vast data and model sizes, as seen in modern large language models. Despite their divergence, this thesis seeks to establish conceptual connections bridging these paradigms. Two complementary forces, structure and destructure, emerge across both paradigms: structure organizes seen symbolic interactions, while destructure, through periodic embedding resets, improves model plasticity and generalization to unseen scenarios. These connections form a new recipe for developing general knowledge engines that can support transparent, controllable, and adaptable intelligent systems.
format Preprint
id arxiv_https___arxiv_org_abs_2509_00949
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Structure and Destructure: Dual Forces in the Making of Knowledge Engines
Chen, Yihong
Computation and Language
Artificial Intelligence
68T05, 68T30, 68T50
I.2.6; I.2.7; H.3.3; K.8.0
title Structure and Destructure: Dual Forces in the Making of Knowledge Engines
topic Computation and Language
Artificial Intelligence
68T05, 68T30, 68T50
I.2.6; I.2.7; H.3.3; K.8.0
url https://arxiv.org/abs/2509.00949