| Main Authors: | Baral, Aditeya, Ajith, Allen George, Nayak, Roshan, Bhanja, Mrityunjay Abhijeet |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.12587 |
Similar Items
Can LLMs $\textit{understand}$ Math? -- Exploring the Pitfalls in Mathematical Reasoning
by: Roy, Tiasa Singha, et al.
Published: (2025)
CodeMixBench: Evaluating Large Language Models on Code Generation with Code-Mixed Prompts
by: Sheokand, Manik, et al.
Published: (2025)
Pooling Attention: Evaluating Pretrained Transformer Embeddings for Deception Classification
by: Mamtani, Sumit, et al.
Published: (2025)
Learning to Decode Collaboratively with Multiple Language Models
by: Shen, Shannon Zejiang, et al.
Published: (2024)
CS-Sum: A Benchmark for Code-Switching Dialogue Summarization and the Limits of Large Language Models
by: Suresh, Sathya Krishnan, et al.
Published: (2025)
Evaluating the Effectiveness of Pre-trained Language Models in Predicting the Helpfulness of Online Product Reviews
by: Boluki, Ali, et al.
Published: (2023)
Entropy Adaptive Decoding: Dynamic Model Switching for Efficient Inference
by: Simonds, Toby
Published: (2025)
Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models
by: Sakarvadia, Mansi, et al.
Published: (2023)
SynapseRoute: An Auto-Route Switching Framework on Dual-State Large Language Model
by: Zhang, Wencheng, et al.
Published: (2025)
Code-Switching Curriculum Learning for Multilingual Transfer in LLMs
by: Yoo, Haneul, et al.
Published: (2024)
How Powerful are Decoder-Only Transformer Neural Models?
by: Roberts, Jesse
Published: (2023)
Are More Tokens Rational? Inference-Time Scaling in Language Models as Adaptive Resource Rationality
by: Hu, Zhimin, et al.
Published: (2026)
Adapting Language Balance in Code-Switching Speech
by: Ugan, Enes Yavuz, et al.
Published: (2025)
Comparative Study of Pre-Trained BERT and Large Language Models for Code-Mixed Named Entity Recognition
by: Shirke, Mayur, et al.
Published: (2025)
DualDiffusion: A Speculative Decoding Strategy for Masked Diffusion Models
by: Goyal, Satyam, et al.
Published: (2026)
Progressive Mixed-Precision Decoding for Efficient LLM Inference
by: Chen, Hao Mark, et al.
Published: (2024)
Decoding Uncertainty: The Impact of Decoding Strategies for Uncertainty Estimation in Large Language Models
by: Hashimoto, Wataru, et al.
Published: (2025)
ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models
by: Patel, Maitreya, et al.
Published: (2023)
Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation
by: Kartik, Kartik, et al.
Published: (2024)
Learning to Keep a Promise: Scaling Language Model Decoding Parallelism with Learned Asynchronous Decoding
by: Jin, Tian, et al.
Published: (2025)
Learning to Explain: Supervised Token Attribution from Transformer Attention Patterns
by: Mihaila, George
Published: (2026)
CHAI for LLMs: Improving Code-Mixed Translation in Large Language Models through Reinforcement Learning with AI Feedback
by: Zhang, Wenbo, et al.
Published: (2024)
Stability-Weighted Decoding for Diffusion Language Models
by: Wu, Yue, et al.
Published: (2026)
Training for Compositional Sensitivity Reduces Dense Retrieval Generalization
by: Ralev, Radoslav, et al.
Published: (2026)
On The Adaptation of Unlimiformer for Decoder-Only Transformers
by: Ahrabian, Kian, et al.
Published: (2024)
The Dual-Stream Transformer: Channelized Architecture for Interpretable Language Modeling
by: Kerce, J. Clayton, et al.
Published: (2026)
DIVERS-Bench: Evaluating Language Identification Across Domain Shifts and Code-Switching
by: Ojo, Jessica, et al.
Published: (2025)
Physics of Language Models: Part 1, Learning Hierarchical Language Structures
by: Allen-Zhu, Zeyuan, et al.
Published: (2023)
Stress Detection on Code-Mixed Texts in Dravidian Languages using Machine Learning
by: Ramos, L., et al.
Published: (2024)
PIER: A Novel Metric for Evaluating What Matters in Code-Switching
by: Ugan, Enes Yavuz, et al.
Published: (2025)
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
by: De, Soham, et al.
Published: (2024)
MaxCode: A Max-Reward Reinforcement Learning Framework for Automated Code Optimization
by: Ou, Jiefu, et al.
Published: (2026)
FlashDecoding++: Faster Large Language Model Inference on GPUs
by: Hong, Ke, et al.
Published: (2023)
Recurrent Drafter for Fast Speculative Decoding in Large Language Models
by: Cheng, Yunfei, et al.
Published: (2024)
Decoding Rarity: Large Language Models in the Diagnosis of Rare Diseases
by: Carbonari, Valentina, et al.
Published: (2025)
Speculative Decoding Across Languages
by: Paudel, Nirajan, et al.
Published: (2026)
ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models
by: Liu, Jing, et al.
Published: (2024)
Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment
by: Nayak, Kota Shamanth Ramanath, et al.
Published: (2024)
Detecting Pretraining Data from Large Language Models
by: Shi, Weijia, et al.
Published: (2023)
RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models
by: Huang, Jie, et al.
Published: (2023)