| Main Authors: | Cappello, Franck, Madireddy, Sandeep, Underwood, Robert, Getty, Neil, Chia, Nicholas Lee-Ping, Ramachandra, Nesar, Nguyen, Josh, Keceli, Murat, Mallick, Tanwi, Li, Zilinghan, Ngom, Marieme, Zhang, Chenhui, Yanguas-Gil, Angel, Antoniuk, Evan, Kailkhura, Bhavya, Tian, Minyang, Du, Yufeng, Ting, Yuan-Sen, Wells, Azton, Nicolae, Bogdan, Maurya, Avinash, Rafique, M. Mustafa, Huerta, Eliu, Li, Bo, Foster, Ian, Stevens, Rick |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.20309 |
Similar Items
Teaching LLMs to Speak Spectroscopy
by: Ramachandra, Nesar, et al.
Published: (2025)
Multi-modal Foundation Model for Cosmological Simulation Data
by: Xia, Bin, et al.
Published: (2025)
DataStates-LLM: Lazy Asynchronous Checkpointing for Large Language Models
by: Maurya, Avinash, et al.
Published: (2024)
Kernel Model Validation: How To Do It, And Why You Should Care
by: Graziani, Carlo, et al.
Published: (2025)
Targeted Adaptive Design
by: Graziani, Carlo, et al.
Published: (2022)
DataStates-LLM: Scalable Checkpointing for Transformer Models Using Composable State Providers
by: Maurya, Avinash, et al.
Published: (2026)
MLP-Offload: Multi-Level, Multi-Path Offloading for LLM Pre-training to Break the GPU Memory Wall
by: Maurya, Avinash, et al.
Published: (2025)
AstroMLab 1: Who Wins Astronomy Jeopardy!?
by: Ting, Yuan-Sen, et al.
Published: (2024)
Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading
by: Maurya, Avinash, et al.
Published: (2024)
Breaking the Memory Wall: A Study of I/O Patterns and GPU Memory Utilization for Hybrid CPU-GPU Offloaded Optimizers
by: Maurya, Avinash, et al.
Published: (2024)
Automated MCQA Benchmarking at Scale: Evaluating Reasoning Traces as Retrieval Sources for Domain Adaptation of Small Language Models
by: Gokdemir, Ozan, et al.
Published: (2025)
Context Length Alone Hurts LLM Performance Despite Perfect Retrieval
by: Du, Yufeng, et al.
Published: (2025)
Who Gets the Reward, Who Gets the Blame? Evaluation-Aligned Training Signals for Multi-LLM Agents
by: Yang, Chih-Hsuan, et al.
Published: (2025)
UProp: Investigating the Uncertainty Propagation of LLMs in Multi-Step Agentic Decision-Making
by: Duan, Jinhao, et al.
Published: (2025)
Uncovering Physical Drivers of Dark Matter Halo Structures with Auxiliary-Variable-Guided Generative Models
by: Ganguli, Arkaprabha, et al.
Published: (2026)
Extending $μ$P: Spectral Conditions for Feature Learning Across Optimizers
by: Gupta, Akshita, et al.
Published: (2026)
AstroMLab 3: Achieving GPT-4o Level Performance in Astronomy with a Specialized 8B-Parameter Large Language Model
by: de Haan, Tijmen, et al.
Published: (2024)
Active Learning Enables Extrapolation in Molecular Generative Models
by: Antoniuk, Evan R., et al.
Published: (2025)
AstroMLab 4: Benchmark-Topping Performance in Astronomy Q&A with a 70B-Parameter Domain-Specialized Reasoning Model
by: de Haan, Tijmen, et al.
Published: (2025)
MOSAIC: Multi-agent Orchestration for Task-Intelligent Scientific Coding
by: Raghavan, Siddeshwar, et al.
Published: (2025)
Wavelet-Inspired Multiscale Graph Convolutional Recurrent Network for Traffic Forecasting
by: Qian, Qipeng, et al.
Published: (2024)
No Test Cases, No Problem: Distillation-Driven Code Generation for Scientific Workflows
by: Raghavan, Siddeshwar, et al.
Published: (2026)
From Atomistic Models to Machine Learning: Predictive Design of Nanocarbons under Extreme Conditions
by: Yan, Xiaoli, et al.
Published: (2026)
Benchmarking AI-evolved cosmological structure formation
by: Dong, Xiaofeng, et al.
Published: (2025)
Enhancing Interpretability in Generative Modeling: Statistically Disentangled Latent Spaces Guided by Generative Factors in Scientific Datasets
by: Ganguli, Arkaprabha, et al.
Published: (2025)
LUMINA: Detecting Hallucinations in RAG System with Context-Knowledge Signals
by: Yeh, Samuel, et al.
Published: (2025)
FedSZ: Leveraging Error-Bounded Lossy Compression for Federated Learning Communications
by: Wilkins, Grant, et al.
Published: (2023)
Reducing Model Error Using Optimised Galaxy Selection: Weak Lensing Cluster Mass Estimation
by: Rau, Markus Michael, et al.
Published: (2024)
Toward Reliable, Safe, and Secure LLMs for Scientific Applications
by: Chaturvedi, Saket Sanjeev, et al.
Published: (2026)
Multi-task Modeling for Engineering Applications with Sparse Data
by: Comlek, Yigitcan, et al.
Published: (2026)
Evaluating the Safety and Skill Reasoning of Large Reasoning Models Under Compute Constraints
by: Balaji, Adarsha, et al.
Published: (2025)
Secure Federated Learning Across Heterogeneous Cloud and High-Performance Computing Resources -- A Case Study on Federated Fine-tuning of LLaMA 2
by: Li, Zilinghan, et al.
Published: (2024)
Double Visual Defense: Adversarial Pre-training and Instruction Tuning for Improving Vision-Language Model Robustness
by: Wang, Zeyu, et al.
Published: (2025)
FedCluster: Boosting the Convergence of Federated Learning via Cluster-Cycling
by: Chen, Cheng, et al.
Published: (2020)
Improving Robustness In Sparse Autoencoders via Masked Regularization
by: Narayanaswamy, Vivek, et al.
Published: (2026)
Certifiably-Robust Federated Adversarial Learning via Randomized Smoothing
by: Chen, Cheng, et al.
Published: (2021)
Training Dynamics of Transformers to Recognize Word Co-occurrence via Gradient Flow Analysis
by: Yang, Hongru, et al.
Published: (2024)
End-to-End Mesh Optimization of a Hybrid Deep Learning Black-Box PDE Solver
by: Ma, Shaocong, et al.
Published: (2024)
Comparative Evaluation of Prompting and Fine-Tuning for Applying Large Language Models to Grid-Structured Geospatial Data
by: Dhruv, Akash, et al.
Published: (2025)
ChatVis: Automating Scientific Visualization with a Large Language Model
by: Mallick, Tanwi, et al.
Published: (2024)