| Main Authors: | Gupta, Aakash, Das, Nataraj |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.13491 |
Similar Items
Compliance-Scored Best-of-N Guardrail Orchestration for Multimodal Document Generation in Payments Dispute Defense
by: Sundar, Nataraj Agaram, et al.
Published: (2026)
Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation
by: Burns, Thomas F, et al.
Published: (2025)
ReDepress: A Cognitive Framework for Detecting Depression Relapse from Social Media
by: Agarwal, Aakash Kumar, et al.
Published: (2025)
RevOrder: A Novel Method for Enhanced Arithmetic in Language Models
by: Shen, Si, et al.
Published: (2024)
A Survey on Large Language Model-empowered Autonomous Driving
by: Zhu, Yuxuan, et al.
Published: (2024)
Benchmarking pre-trained text embedding models in aligning built asset information
by: Shahinmoghadam, Mehrzad, et al.
Published: (2024)
CrisisKAN: Knowledge-infused and Explainable Multimodal Attention Network for Crisis Event Classification
by: Gupta, Shubham, et al.
Published: (2024)
TLDR at SemEval-2024 Task 2: T5-generated clinical-Language summaries for DeBERTa Report Analysis
by: Das, Spandan, et al.
Published: (2024)
LoRA on the Go: Instance-level Dynamic LoRA Selection and Merging
by: Lee, Seungeon, et al.
Published: (2025)
Simple and Scalable Strategies to Continually Pre-train Large Language Models
by: Ibrahim, Adam, et al.
Published: (2024)
Auto-Cypher: Improving LLMs on Cypher generation via LLM-supervised generation-verification framework
by: Tiwari, Aman, et al.
Published: (2024)
Continual Pre-training of MoEs: How robust is your router?
by: Thérien, Benjamin, et al.
Published: (2025)
Automatic Classification of User Requirements from Online Feedback -- A Replication Study
by: Bhatt, Meet, et al.
Published: (2025)
Self-Improving Pretraining: using post-trained models to pretrain better models
by: Tan, Ellen Xiaoqing, et al.
Published: (2026)
Generation and De-Identification of Indian Clinical Discharge Summaries using LLMs
by: Singh, Sanjeet, et al.
Published: (2024)
Improving Self-supervised Pre-training using Accent-Specific Codebooks
by: Prabhu, Darshan, et al.
Published: (2024)
Hierarchical temporal receptive windows and zero-shot timescale generalization in biologically constrained scale-invariant deep networks
by: Sarkar, Aakash, et al.
Published: (2026)
Towards the Dynamics of a DNN Learning Symbolic Interactions
by: Ren, Qihan, et al.
Published: (2024)
Zero-shot data citation function classification using transformer-based large language models (LLMs)
by: Byers, Neil, et al.
Published: (2025)
TorchTitan: One-stop PyTorch native solution for production ready LLM pre-training
by: Liang, Wanchao, et al.
Published: (2024)
Comprehensive Modeling and Question Answering of Cancer Clinical Practice Guidelines using LLMs
by: Gupta, Bhumika, et al.
Published: (2025)
An explainable transformer circuit for compositional generalization
by: Tang, Cheng, et al.
Published: (2025)
Technical Report: Quantifying and Analyzing the Generalization Power of a DNN
by: He, Yuxuan, et al.
Published: (2025)
Revisiting Generalization Power of a DNN in Terms of Symbolic Interactions
by: Cheng, Lei, et al.
Published: (2025)
Grokking in Linear Models for Logistic Regression
by: Das, Nataraj, et al.
Published: (2026)
A Comprehensive Evaluation framework of Alignment Techniques for LLMs
by: Azmat, Muneeza, et al.
Published: (2025)
When can transformers reason with abstract symbols?
by: Boix-Adsera, Enric, et al.
Published: (2023)
Early Linguistic Pattern of Anxiety from Social Media Using Interpretable Linguistic Features: A Multi-Faceted Validation Study with Author-Disjoint Evaluation
by: Utsa, Arnab Das
Published: (2026)
LayerIF: Estimating Layer Quality for Large Language Models using Influence Functions
by: Askari, Hadi, et al.
Published: (2025)
Prioritized Replay for RL Post-training
by: Fatemi, Mehdi
Published: (2026)
An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes using large language models
by: Shyr, Cathy, et al.
Published: (2026)
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
by: Berglund, Lukas, et al.
Published: (2023)
Improving Model Evaluation using SMART Filtering of Benchmark Datasets
by: Gupta, Vipul, et al.
Published: (2024)
Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding
by: Srinivas, Sakhinana Sagar, et al.
Published: (2025)
Automatic Pair Construction for Contrastive Post-training
by: Xu, Canwen, et al.
Published: (2023)
Tiny-Toxic-Detector: A compact transformer-based model for toxic content detection
by: Kamphuis, Michiel
Published: (2024)
Aurora-M: Open Source Continual Pre-training for Multilingual Language and Code
by: Nakamura, Taishi, et al.
Published: (2024)
SUS backprop: linear backpropagation algorithm for long inputs in transformers
by: Pankov, Sergey, et al.
Published: (2025)
PoTPTQ: A Two-step Power-of-Two Post-training for LLMs
by: Wang, Xinyu, et al.
Published: (2025)
A Unified Framework for Model Editing
by: Gupta, Akshat, et al.
Published: (2024)