| Main Authors: | Srivastava, Prerak, Corallo, Giulio, Rybalko, Sergey |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.01147 |
Similar Items
Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning
by: Corallo, Giulio, et al.
Published: (2025)
Implicit Word Reordering with Knowledge Distillation for Cross-Lingual Dependency Parsing
by: Li, Zhuoran, et al.
Published: (2025)
BitNet a4.8: 4-bit Activations for 1-bit LLMs
by: Wang, Hongyu, et al.
Published: (2024)
BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs
by: Wang, Hongyu, et al.
Published: (2025)
Parallel Context-of-Experts Decoding for Retrieval Augmented Generation
by: Corallo, Giulio, et al.
Published: (2026)
Continuously Learning New Words in Automatic Speech Recognition
by: Huber, Christian, et al.
Published: (2024)
Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling
by: Huang, Hongzhi, et al.
Published: (2025)
AAAC: Activation-Aware Adaptive Codebooks for 4-bit LLM Weight Quantization
by: IslamBouli, Beshr, et al.
Published: (2026)
An Image is Worth Multiple Words: Discovering Object Level Concepts using Multi-Concept Prompt Learning
by: Jin, Chen, et al.
Published: (2023)
Pruning Literals for Highly Efficient Explainability at Word Level
by: Yadav, Rohan Kumar, et al.
Published: (2024)
Empirical Study of Named Entity Recognition Performance Using Distribution-aware Word Embedding
by: Chen, Xin, et al.
Published: (2021)
Adapting Large Language Models for Parameter-Efficient Log Anomaly Detection
by: Lim, Ying Fu, et al.
Published: (2025)
ArEEG_Words: Dataset for Envisioned Speech Recognition using EEG for Arabic Words
by: Darwish, Hazem, et al.
Published: (2024)
Urdu Dependency Parsing and Treebank Development: A Syntactic and Morphological Perspective
by: Habib, Nudrat
Published: (2024)
Improving Block-Wise LLM Quantization by 4-bit Block-Wise Optimal Float (BOF4): Analysis and Variations
by: Blumenberg, Patrick, et al.
Published: (2025)
Is Escalation Worth It? A Decision-Theoretic Characterization of LLM Cascades
by: Bouchard, Dylan
Published: (2026)
BitDelta: Your Fine-Tune May Only Be Worth One Bit
by: Liu, James, et al.
Published: (2024)
BinaryPPO: Efficient Policy Optimization for Binary Classification
by: Pandey, Punya Syon, et al.
Published: (2026)
AMAQ: Adaptive Mixed-bit Activation Quantization for Collaborative Parameter Efficient Fine-tuning
by: Song, Yurun, et al.
Published: (2025)
C-ing Clearly: Enhanced Binary Code Explanations using C code
by: Poncu, Teodor, et al.
Published: (2025)
Ensemble Distillation for Unsupervised Constituency Parsing
by: Shayegh, Behzad, et al.
Published: (2023)
A Truly Joint Neural Architecture for Segmentation and Parsing
by: Levi, Danit Yshaayahu, et al.
Published: (2024)
Neural-Bayesian Program Learning for Few-shot Dialogue Intent Parsing
by: Hong, Mengze, et al.
Published: (2024)
How Much Is One Recurrence Worth? Iso-Depth Scaling Laws for Looped Language Models
by: Schwethelm, Kristian, et al.
Published: (2026)
Is More Data Worth the Cost? Dataset Scaling Laws in a Tiny Attention-Only Decoder
by: Wiegand, Götz-Henrik, et al.
Published: (2026)
AdaParse: An Adaptive Parallel PDF Parsing and Resource Scaling Engine
by: Siebenschuh, Carlo, et al.
Published: (2025)
When are 1.58 bits enough? A Bottom-up Exploration of BitNet Quantization
by: Nielsen, Jacob, et al.
Published: (2024)
Adapting Abstract Meaning Representation Parsing to the Clinical Narrative -- the SPRING THYME parser
by: Cai, Jon Z., et al.
Published: (2024)
Debugging Tabular Log as Dynamic Graphs
by: Liang, Chumeng, et al.
Published: (2025)
Log-linear Guardedness and its Implications
by: Ravfogel, Shauli, et al.
Published: (2022)
Skipformer: A Skip-and-Recover Strategy for Efficient Speech Recognition
by: Zhu, Wenjing, et al.
Published: (2024)
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
by: Liu, Zirui, et al.
Published: (2024)
Programming by Backprop: An Instruction is Worth 100 Examples When Finetuning LLMs
by: Cook, Jonathan, et al.
Published: (2025)
Turn Waste into Worth: Rectifying Top-$k$ Router of MoE
by: Zeng, Zhiyuan, et al.
Published: (2024)
Tree Matching Networks for Natural Language Inference: Parameter-Efficient Semantic Understanding via Dependency Parse Trees
by: Lunder, Jason
Published: (2025)
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
by: Ma, Shuming, et al.
Published: (2024)
SWE2: SubWord Enriched and Significant Word Emphasized Framework for Hate Speech Detection
by: Mou, Guanyi, et al.
Published: (2024)
Learning Page Order in Shuffled WOO Releases
by: Kahraman, Efe, et al.
Published: (2026)
Let Me Think! A Long Chain-of-Thought Can Be Worth Exponentially Many Short Ones
by: Mirtaheri, Parsa, et al.
Published: (2025)
What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions
by: Choe, Sang Keun, et al.
Published: (2024)