| Main Author: | Aggarwal, Arpit |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2405.04585 |
Similar Items
RePo: Language Models with Context Re-Positioning
by: Li, Huayang, et al.
Published: (2025)
SeqPE: Transformer with Sequential Position Encoding
by: Li, Huayang, et al.
Published: (2025)
HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation
by: Chen, Yuhan, et al.
Published: (2024)
Group Representational Position Encoding
by: Zhang, Yifan, et al.
Published: (2025)
Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models
by: Zhuo, Zhijian, et al.
Published: (2024)
Hierarchical Orthogonal Residual Spread for Precise Massive Editing in Large Language Models
by: Gu, Xiaojie, et al.
Published: (2026)
Fourier Head: Helping Large Language Models Learn Complex Probability Distributions
by: Gillman, Nate, et al.
Published: (2024)
Rotary Positional Embeddings as Phase Modulation: Theoretical Bounds on the RoPE Base for Long-Context Transformers
by: Liu, Feilong
Published: (2026)
Position Engineering: Boosting Large Language Models through Positional Information Manipulation
by: He, Zhiyuan, et al.
Published: (2024)
RoPE Distinguishes Neither Positions Nor Tokens in Long Contexts, Provably
by: Du, Yufeng, et al.
Published: (2026)
ORION: Teaching Language Models to Reason Efficiently in the Language of Thought
by: Tanmay, Kumar, et al.
Published: (2025)
Extracting and Encoding: Leveraging Large Language Models and Medical Knowledge to Enhance Radiological Text Representation
by: Messina, Pablo, et al.
Published: (2024)
Group-Aware Reinforcement Learning for Output Diversity in Large Language Models
by: Anschel, Oron, et al.
Published: (2025)
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
by: Aggarwal, Pranjal, et al.
Published: (2025)
HoPE: Hyperbolic Rotary Positional Encoding for Stable Long-Range Dependency Modeling in Large Language Models
by: Dai, Chang, et al.
Published: (2025)
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
by: He, Zhenyu, et al.
Published: (2024)
CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs
by: Li, Haoran, et al.
Published: (2026)
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling
by: Limisiewicz, Tomasz, et al.
Published: (2024)
Needle in the Haystack for Memory Based Large Language Models
by: Nelson, Elliot, et al.
Published: (2024)
PoTPTQ: A Two-step Power-of-Two Post-training for LLMs
by: Wang, Xinyu, et al.
Published: (2025)
Sequential Large Language Model-Based Hyper-parameter Optimization
by: Mahammadli, Kanan, et al.
Published: (2024)
Self-Supervised Position Debiasing for Large Language Models
by: Liu, Zhongkun, et al.
Published: (2024)
Struc-EMB: The Potential of Structure-Aware Encoding in Language Embeddings
by: Liu, Shikun, et al.
Published: (2025)
PoPreRo: A New Dataset for Popularity Prediction of Romanian Reddit Posts
by: Rogoz, Ana-Cristina, et al.
Published: (2024)
Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents
by: Kirchhof, Michael, et al.
Published: (2025)
From Construction to Injection: Edit-Based Fingerprints for Large Language Models
by: Li, Yue, et al.
Published: (2025)
Demystifying the Slash Pattern in Attention: The Role of RoPE
by: Cheng, Yuan, et al.
Published: (2026)
Unraveling LoRA Interference: Orthogonal Subspaces for Robust Model Merging
by: Zhang, Haobo, et al.
Published: (2025)
Method-Based Reasoning for Large Language Models: Extraction, Reuse, and Continuous Improvement
by: Su, Hong
Published: (2025)
PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models
by: Xu, Yinggan, et al.
Published: (2025)
CASCADE: Case-Based Continual Adaptation for Large Language Models During Deployment
by: Guo, Siyuan, et al.
Published: (2026)
Large Language Model Pruning
by: Huang, Hanjuan, et al.
Published: (2024)
Causality for Large Language Models
by: Wu, Anpeng, et al.
Published: (2024)
Large Language Model Unlearning
by: Yao, Yuanshun, et al.
Published: (2023)
Foundations of Large Language Models
by: Xiao, Tong, et al.
Published: (2025)
Large Language Models as Optimizers
by: Yang, Chengrun, et al.
Published: (2023)
Orthogonal Finetuning for Direct Preference Optimization
by: Yang, Chenxu, et al.
Published: (2024)
The Geometries of Truth Are Orthogonal Across Tasks
by: Azizian, Waiss, et al.
Published: (2025)
BAPO: Base-Anchored Preference Optimization for Overcoming Forgetting in Large Language Models Personalization
by: Lee, Gihun, et al.
Published: (2024)
SEE: Sememe Entanglement Encoding for Transformer-bases Models Compression
by: Zhang, Jing, et al.
Published: (2024)