| Main Authors: | Raab, Reilly, Parker, Mike, Nally, Dan, Montgomery, Sadie, Bernat, Anastasia, Munikoti, Sai, Horawalavithana, Sameera |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.08109 |
Similar Items
Benchmarking LLMs for Environmental Review and Permitting
by: Meyur, Rounak, et al.
Published: (2024)
Surprisingly Fragile: Assessing and Addressing Prompt Instability in Multimodal Foundation Models
by: Stewart, Ian, et al.
Published: (2024)
Back to the Barn with LLAMAs: Evolving Pretrained LLM Backbones in Finetuning Vision Language Models
by: Horawalavithana, Sameera, et al.
Published: (2026)
SCITUNE: Aligning Large Language Models with Human-Curated Scientific Multimodal Instructions
by: Horawalavithana, Sameera, et al.
Published: (2023)
MULTISEISMO: A Multimodal Seismic Dataset and Model for Cross-Modal Seismic Understanding
by: Munikoti, Sai, et al.
Published: (2026)
Generalist Multimodal AI: A Review of Architectures, Challenges and Opportunities
by: Munikoti, Sai, et al.
Published: (2024)
WeQA: A Benchmark for Retrieval Augmented Generation in Wind Energy Domain
by: Meyur, Rounak, et al.
Published: (2024)
Evaluating the Robustness of Dense Retrievers in Interdisciplinary Domains
by: Chaturvedi, Sarthak, et al.
Published: (2025)
Reward Design for Physical Reasoning in Vision-Language Models
by: Lilienthal, Derek, et al.
Published: (2026)
Directional Concentration Uncertainty: A representational approach to uncertainty quantification for generative models
by: Chattopadhyay, Souradeep, et al.
Published: (2026)
Xwin-LM: Strong and Scalable Alignment Practice for LLMs
by: Ni, Bolin, et al.
Published: (2024)
Is On-Policy Data always the Best Choice for Direct Preference Optimization-based LM Alignment?
by: Sun, Zetian, et al.
Published: (2025)
AuditBench: Evaluating Alignment Auditing Techniques on Models with Hidden Behaviors
by: Sheshadri, Abhay, et al.
Published: (2026)
How does a Multilingual LM Handle Multiple Languages?
by: Kakarla, Santhosh, et al.
Published: (2025)
SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment
by: Liu, Qin, et al.
Published: (2024)
HumanLM: Simulating Users with State Alignment Beats Response Imitation
by: Wu, Shirley, et al.
Published: (2026)
DarwinLM: Evolutionary Structured Pruning of Large Language Models
by: Tang, Shengkun, et al.
Published: (2025)
Computational Job Market Analysis with Natural Language Processing
by: Zhang, Mike
Published: (2024)
Beyond Public Access in LLM Pre-Training Data
by: Rosenblat, Sruly, et al.
Published: (2025)
Evaluating Memory Condensation Strategies for Coding Agents in Data-Driven Scientific Discovery
by: Chintalapati, Renuka, et al.
Published: (2026)
BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM
by: Shen, Zhewen, et al.
Published: (2024)
Reseña de "The People's Property?: Power, Politics, and the Public" de Lynn Staeheli y Donald Mitchell
by: Ignasi Bernat
Published: (2012)
How do languages influence each other? Studying cross-lingual data sharing during LM fine-tuning
by: Choenni, Rochelle, et al.
Published: (2023)
Diverse Preference Learning for Capabilities and Alignment
by: Slocum, Stewart, et al.
Published: (2025)
Replicating ReLM Results: Validating Large Language Models with ReLM
by: Adamson, Reece, et al.
Published: (2025)
BabyLM Turns 3: Call for papers for the 2025 BabyLM workshop
by: Charpentier, Lucas, et al.
Published: (2025)
Identifying the Risks of LM Agents with an LM-Emulated Sandbox
by: Ruan, Yangjun, et al.
Published: (2023)
TruthTorchLM: A Comprehensive Library for Predicting Truthfulness in LLM Outputs
by: Yaldiz, Duygu Nur, et al.
Published: (2025)
There is more to graphs than meets the eye: Learning universal features with self-supervision
by: Das, Laya, et al.
Published: (2023)
PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization
by: Wang, Yidong, et al.
Published: (2023)
Arabic Stable LM: Adapting Stable LM 2 1.6B to Arabic
by: Alyafeai, Zaid, et al.
Published: (2024)
The Power of Summary-Source Alignments
by: Ernst, Ori, et al.
Published: (2024)
Generative User-Experience Research for Developing Domain-specific Natural Language Processing Applications
by: Zhukova, Anastasia, et al.
Published: (2023)
ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models
by: Qian, Kaizhi, et al.
Published: (2025)
ContextLeak: Auditing Leakage in Private In-Context Learning Methods
by: Choi, Jacob, et al.
Published: (2025)
Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on Large Language Models
by: Kurtic, Eldar, et al.
Published: (2024)
LokiLM: Technical Report
by: Kiefel, Justin, et al.
Published: (2024)
BabyLM Turns 4 and Goes Multilingual: Call for Papers for the 2026 BabyLM Workshop
by: Choshen, Leshem, et al.
Published: (2026)
SemanticShield: LLM-Powered Audits Expose Shilling Attacks in Recommender Systems
by: Li, Kaihong, et al.
Published: (2025)
ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding
by: Paul, Indraneil, et al.
Published: (2025)