:: Library Catalog

Bibliographic Details
Main Authors:	Raab, Reilly; Parker, Mike; Nally, Dan; Montgomery, Sadie; Bernat, Anastasia; Munikoti, Sai; Horawalavithana, Sameera
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2507.08109

Similar Items

Benchmarking LLMs for Environmental Review and Permitting
by: Meyur, Rounak, et al.
Published: (2024)

Surprisingly Fragile: Assessing and Addressing Prompt Instability in Multimodal Foundation Models
by: Stewart, Ian, et al.
Published: (2024)

Back to the Barn with LLAMAs: Evolving Pretrained LLM Backbones in Finetuning Vision Language Models
by: Horawalavithana, Sameera, et al.
Published: (2026)

SCITUNE: Aligning Large Language Models with Human-Curated Scientific Multimodal Instructions
by: Horawalavithana, Sameera, et al.
Published: (2023)

MULTISEISMO: A Multimodal Seismic Dataset and Model for Cross-Modal Seismic Understanding
by: Munikoti, Sai, et al.
Published: (2026)

Generalist Multimodal AI: A Review of Architectures, Challenges and Opportunities
by: Munikoti, Sai, et al.
Published: (2024)

WeQA: A Benchmark for Retrieval Augmented Generation in Wind Energy Domain
by: Meyur, Rounak, et al.
Published: (2024)

Evaluating the Robustness of Dense Retrievers in Interdisciplinary Domains
by: Chaturvedi, Sarthak, et al.
Published: (2025)

Reward Design for Physical Reasoning in Vision-Language Models
by: Lilienthal, Derek, et al.
Published: (2026)

Directional Concentration Uncertainty: A representational approach to uncertainty quantification for generative models
by: Chattopadhyay, Souradeep, et al.
Published: (2026)

Xwin-LM: Strong and Scalable Alignment Practice for LLMs
by: Ni, Bolin, et al.
Published: (2024)

Is On-Policy Data always the Best Choice for Direct Preference Optimization-based LM Alignment?
by: Sun, Zetian, et al.
Published: (2025)

AuditBench: Evaluating Alignment Auditing Techniques on Models with Hidden Behaviors
by: Sheshadri, Abhay, et al.
Published: (2026)

How does a Multilingual LM Handle Multiple Languages?
by: Kakarla, Santhosh, et al.
Published: (2025)

SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment
by: Liu, Qin, et al.
Published: (2024)

HumanLM: Simulating Users with State Alignment Beats Response Imitation
by: Wu, Shirley, et al.
Published: (2026)

DarwinLM: Evolutionary Structured Pruning of Large Language Models
by: Tang, Shengkun, et al.
Published: (2025)

Computational Job Market Analysis with Natural Language Processing
by: Zhang, Mike
Published: (2024)

Beyond Public Access in LLM Pre-Training Data
by: Rosenblat, Sruly, et al.
Published: (2025)

Evaluating Memory Condensation Strategies for Coding Agents in Data-Driven Scientific Discovery
by: Chintalapati, Renuka, et al.
Published: (2026)

BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM
by: Shen, Zhewen, et al.
Published: (2024)

Reseña de "The People's Property?: Power, Politics, and the Public" de Lynn Staeheli y Donald Mitchell
by: Ignasi Bernat
Published: (2012)

How do languages influence each other? Studying cross-lingual data sharing during LM fine-tuning
by: Choenni, Rochelle, et al.
Published: (2023)

Diverse Preference Learning for Capabilities and Alignment
by: Slocum, Stewart, et al.
Published: (2025)

Replicating ReLM Results: Validating Large Language Models with ReLM
by: Adamson, Reece, et al.
Published: (2025)

BabyLM Turns 3: Call for papers for the 2025 BabyLM workshop
by: Charpentier, Lucas, et al.
Published: (2025)

Identifying the Risks of LM Agents with an LM-Emulated Sandbox
by: Ruan, Yangjun, et al.
Published: (2023)

TruthTorchLM: A Comprehensive Library for Predicting Truthfulness in LLM Outputs
by: Yaldiz, Duygu Nur, et al.
Published: (2025)

There is more to graphs than meets the eye: Learning universal features with self-supervision
by: Das, Laya, et al.
Published: (2023)

PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization
by: Wang, Yidong, et al.
Published: (2023)

Arabic Stable LM: Adapting Stable LM 2 1.6B to Arabic
by: Alyafeai, Zaid, et al.
Published: (2024)

The Power of Summary-Source Alignments
by: Ernst, Ori, et al.
Published: (2024)

Generative User-Experience Research for Developing Domain-specific Natural Language Processing Applications
by: Zhukova, Anastasia, et al.
Published: (2023)

ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models
by: Qian, Kaizhi, et al.
Published: (2025)

ContextLeak: Auditing Leakage in Private In-Context Learning Methods
by: Choi, Jacob, et al.
Published: (2025)

Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on Large Language Models
by: Kurtic, Eldar, et al.
Published: (2024)

LokiLM: Technical Report
by: Kiefel, Justin, et al.
Published: (2024)

BabyLM Turns 4 and Goes Multilingual: Call for Papers for the 2026 BabyLM Workshop
by: Choshen, Leshem, et al.
Published: (2026)

SemanticShield: LLM-Powered Audits Expose Shilling Attacks in Recommender Systems
by: Li, Kaihong, et al.
Published: (2025)

ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding
by: Paul, Indraneil, et al.
Published: (2025)