:: Library Catalog

Bibliographic Details
Main Authors:	Zou, Yuchun; Tong, Junhong; Li, Jun
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2605.29000

Similar Items

The Lossy Horizon: Error-Bounded Predictive Coding for Lossy Text Compression (Episode I)
by: Aghanya, Nnamdi, et al.
Published: (2025)

SemanticZip: A Pilot Framework for Lossy Text Compression with LLMs as Semantic Decompressors
by: Trukhina, Natalia, et al.
Published: (2026)

Learning is Forgetting: LLM Training As Lossy Compression
by: Conklin, Henry C., et al.
Published: (2026)

Wikipedia is Not a Dictionary, Delete! Text Classification as a Proxy for Analysing Wiki Deletion Discussions
by: Borkakoty, Hsuvas, et al.
Published: (2025)

Adverb Is the Key: Simple Text Data Augmentation with Adverb Deletion
by: Choi, Juhwan, et al.
Published: (2024)

Anonymous-by-Construction: An LLM-Driven Framework for Privacy-Preserving Text
by: Albanese, Federico, et al.
Published: (2026)

Text Compression for Efficient Language Generation
by: Gu, David, et al.
Published: (2025)

Fact-Preserved Personalized News Headline Generation
by: Yang, Zhao, et al.
Published: (2025)

Context Cascade Compression: Exploring the Upper Limits of Text Compression
by: Liu, Fanfan, et al.
Published: (2025)

VTC-R1: Vision-Text Compression for Efficient Long-Context Reasoning
by: Wang, Yibo, et al.
Published: (2026)

Look Ahead Text Understanding and LLM Stitching
by: Jiang, Junlin Julian, et al.
Published: (2024)

Multi-LLM Text Summarization
by: Fang, Jiangnan, et al.
Published: (2024)

The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
by: Tang, Zhenheng, et al.
Published: (2025)

Rethinking the Privacy of Text Embeddings: A Reproducibility Study of "Text Embeddings Reveal (Almost) As Much As Text"
by: Seputis, Dominykas, et al.
Published: (2025)

Beyond Text Compression: Evaluating Tokenizers Across Scales
by: Lotz, Jonas F., et al.
Published: (2025)

Hypernym Mercury: Token Optimization Through Semantic Field Constriction And Reconstruction From Hypernyms. A New Text Compression Method
by: Forrester, Chris, et al.
Published: (2025)

Advancing NLP Models with Strategic Text Augmentation: A Comprehensive Study of Augmentation Methods and Curriculum Strategies
by: Kesgin, Himmet Toprak, et al.
Published: (2024)

On the Detectability of LLM-Generated Text: What Exactly Is LLM-Generated Text?
by: Geng, Mingmeng, et al.
Published: (2025)

Pushing The Limit of LLM Capacity for Text Classification
by: Zhang, Yazhou, et al.
Published: (2024)

Training LLMs over Neurally Compressed Text
by: Lester, Brian, et al.
Published: (2024)

TextQuests: How Good are LLMs at Text-Based Video Games?
by: Phan, Long, et al.
Published: (2025)

StyleDecipher: Robust and Explainable Detection of LLM-Generated Texts with Stylistic Analysis
by: Li, Siyuan, et al.
Published: (2025)

RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL
by: Wu, Zhenhe, et al.
Published: (2024)

Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance
by: Devatine, Nicolas, et al.
Published: (2024)

Beyond Easy Wins: A Text Hardness-Aware Benchmark for LLM-generated Text Detection
by: Ayoobi, Navid, et al.
Published: (2025)

On Preserving the Knowledge of Long Clinical Texts
by: Hasan, Mohammad Junayed, et al.
Published: (2023)

The Text Uncanny Valley: Non-Monotonic Performance Degradation in LLM Information Retrieval
by: Tong, Zekai, et al.
Published: (2026)

Cosmos: Compressed and Smooth Latent Space for Text Diffusion Modeling
by: Meshchaninov, Viacheslav, et al.
Published: (2025)

Learning to Rewrite: Generalized LLM-Generated Text Detection
by: Li, Ran, et al.
Published: (2024)

QUDsim: Quantifying Discourse Similarities in LLM-Generated Text
by: Namuduri, Ramya, et al.
Published: (2025)

Model-Agnostic Sentiment Distribution Stability Analysis for Robust LLM-Generated Texts Detection
by: Li, Siyuan, et al.
Published: (2025)

ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference
by: Liu, Xiang, et al.
Published: (2025)

NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human
by: Huang, Shuo, et al.
Published: (2024)

DSIPA: Detecting LLM-Generated Texts via Sentiment-Invariant Patterns Divergence Analysis
by: Li, Siyuan, et al.
Published: (2026)

Robust Utility-Preserving Text Anonymization Based on Large Language Models
by: Yang, Tianyu, et al.
Published: (2024)

Current State in Privacy-Preserving Text Preprocessing for Domain-Agnostic NLP
by: Sinha, Abhirup, et al.
Published: (2025)

TextGrad: Automatic "Differentiation" via Text
by: Yuksekgonul, Mert, et al.
Published: (2024)

Hierarchical Text Classification with LLM-Refined Taxonomies
by: Golde, Jonas, et al.
Published: (2026)

SliceGPT: Compress Large Language Models by Deleting Rows and Columns
by: Ashkboos, Saleh, et al.
Published: (2024)

Enhancing Medication Recommendation with LLM Text Representation
by: Lee, Yu-Tzu
Published: (2024)