:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Yanlai; Jones, Matt; Mozer, Michael C.; Ren, Mengye
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2403.09613

Similar Items

Memory Storyboard: Leveraging Temporal Segmentation for Streaming Self-Supervised Learning from Egocentric Videos
by: Yang, Yanlai, et al.
Published: (2025)

LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos
by: Wang, Ying, et al.
Published: (2023)

Are LLMs Prescient? A Continuous Evaluation using Daily News as the Oracle
by: Dai, Hui, et al.
Published: (2024)

Context Tuning for In-Context Optimization
by: Lu, Jack, et al.
Published: (2025)

Learning and Forgetting Unsafe Examples in Large Language Models
by: Zhao, Jiachen, et al.
Published: (2023)

Aligning LLMs with Human Uncertainty: A Beta-Bernoulli Calibrator for LLM Forecasting
by: Dai, Hui, et al.
Published: (2026)

Anticipatory Evaluation of Language Models
by: Park, Jungsoo, et al.
Published: (2025)

Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings
by: Gopalakrishnan, Anand, et al.
Published: (2025)

Seeking the Unfamiliar but Memorable: Conceptual Creativity as Meta-Learning
by: Ren, Mengye
Published: (2026)

Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents
by: Kim, Kangsan, et al.
Published: (2026)

A General Framework for Inference-time Scaling and Steering of Diffusion Models
by: Singhal, Raghav, et al.
Published: (2025)

Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics
by: Hoang, Christopher, et al.
Published: (2025)

LURE: Latent Space Unblocking for Multi-Concept Reawakening in Diffusion Models
by: Sun, Mengyu, et al.
Published: (2026)

Mechanistic origins of catastrophic forgetting: why RL preserves circuits better than SFT?
by: Nunez, Jeanmely Rojas, et al.
Published: (2026)

AKReF: An argumentative knowledge representation framework for structured argumentation
by: Bhattacharjee, Debarati, et al.
Published: (2025)

Using Pre-trained LLMs for Multivariate Time Series Forecasting
by: Wolff, Malcolm L., et al.
Published: (2025)

KGLink: A column type annotation method that combines knowledge graph and pre-trained language model
by: Wang, Yubo, et al.
Published: (2024)

Pretraining with hierarchical memories: separating long-tail and common knowledge
by: Pouransari, Hadi, et al.
Published: (2025)

Thinking Augmented Pre-training
by: Wang, Liang, et al.
Published: (2025)

Transferable Post-training via Inverse Value Learning
by: Lu, Xinyu, et al.
Published: (2024)

Learning without training: The implicit dynamics of in-context learning
by: Dherin, Benoit, et al.
Published: (2025)

Mapping Technological Futures: Anticipatory Discourse Through Text Mining
by: Skorski, Maciej, et al.
Published: (2025)

Bootstrapping Post-training Signals for Open-ended Tasks via Rubric-based Self-play on Pre-training Text
by: Huang, Chengyu, et al.
Published: (2026)

Variance Control via Weight Rescaling in LLM Pre-training
by: Owen, Louis, et al.
Published: (2025)

Anticipatory Understanding of Resilient Agriculture to Climate
by: Willmes, David, et al.
Published: (2024)

Generative Pre-training for Speech with Flow Matching
by: Liu, Alexander H., et al.
Published: (2023)

Perplexity-Aware Data Scaling Law: Perplexity Landscapes Predict Performance for Continual Pre-training
by: Liu, Lei, et al.
Published: (2025)

Structural Pruning of Pre-trained Language Models via Neural Architecture Search
by: Klein, Aaron, et al.
Published: (2024)

MLKD-BERT: Multi-level Knowledge Distillation for Pre-trained Language Models
by: Zhang, Ying, et al.
Published: (2024)

Counterfactual Evaluation Reveals Hidden Capability Profiles in Clinical LLMs and Agents
by: Turk, Matt
Published: (2026)

Model Merging in Pre-training of Large Language Models
by: Li, Yunshui, et al.
Published: (2025)

Post-training for Efficient Communication via Convention Formation
by: Hua, Yilun, et al.
Published: (2025)

Unmasking Backdoors: An Explainable Defense via Gradient-Attention Anomaly Scoring for Pre-trained Language Models
by: Das, Anindya Sundar, et al.
Published: (2025)

nanoLM: an Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales
by: Yao, Yiqun, et al.
Published: (2023)

Machine-assisted writing evaluation: Exploring pre-trained language models in analyzing argumentative moves
by: Qin, Wenjuan, et al.
Published: (2025)

On the effective transfer of knowledge from English to Hindi Wikipedia
by: Das, Paramita, et al.
Published: (2024)

Methods of improving LLM training stability
by: Rybakov, Oleg, et al.
Published: (2024)

CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models
by: Gu, Jiawei, et al.
Published: (2024)

Collaboratively adding new knowledge to an LLM
by: Lee, Rhui Dih, et al.
Published: (2024)

Robust LLM safeguarding via refusal feature adversarial training
by: Yu, Lei, et al.
Published: (2024)