Saved in:
| Main Authors: | , |
|---|---|
| Format: | Preprint |
| Published: |
2025
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.08058 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866915287739138048 |
|---|---|
| author | Forrester, Chris Sulea, Octavia |
| author_facet | Forrester, Chris Sulea, Octavia |
| contents | Compute optimization using token reduction of LLM prompts is an emerging task in the fields of NLP and next generation, agentic AI. In this white paper, we introduce a novel (patent pending) text representation scheme and a first-of-its-kind word-level semantic compression of paragraphs that can lead to over 90% token reduction, while retaining high semantic similarity to the source text. We explain how this novel compression technique can be lossless and how the detail granularity is controllable. We discuss benchmark results over open source data (i.e. Bram Stoker's Dracula available through Project Gutenberg) and show how our results hold at the paragraph level, across multiple genres and models. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2505_08058 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| spellingShingle | Hypernym Mercury: Token Optimization Through Semantic Field Constriction And Reconstruction From Hypernyms. A New Text Compression Method Forrester, Chris Sulea, Octavia Computation and Language Compute optimization using token reduction of LLM prompts is an emerging task in the fields of NLP and next generation, agentic AI. In this white paper, we introduce a novel (patent pending) text representation scheme and a first-of-its-kind word-level semantic compression of paragraphs that can lead to over 90% token reduction, while retaining high semantic similarity to the source text. We explain how this novel compression technique can be lossless and how the detail granularity is controllable. We discuss benchmark results over open source data (i.e. Bram Stoker's Dracula available through Project Gutenberg) and show how our results hold at the paragraph level, across multiple genres and models. |
| title | Hypernym Mercury: Token Optimization Through Semantic Field Constriction And Reconstruction From Hypernyms. A New Text Compression Method |
| topic | Computation and Language |
| url | https://arxiv.org/abs/2505.08058 |