| Main Authors: | Yang, Yanlai, Jones, Matt, Mozer, Michael C., Ren, Mengye |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2403.09613 |
Similar Items
Memory Storyboard: Leveraging Temporal Segmentation for Streaming Self-Supervised Learning from Egocentric Videos
by: Yang, Yanlai, et al.
Published: (2025)
LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos
by: Wang, Ying, et al.
Published: (2023)
Are LLMs Prescient? A Continuous Evaluation using Daily News as the Oracle
by: Dai, Hui, et al.
Published: (2024)
Context Tuning for In-Context Optimization
by: Lu, Jack, et al.
Published: (2025)
Learning and Forgetting Unsafe Examples in Large Language Models
by: Zhao, Jiachen, et al.
Published: (2023)
Aligning LLMs with Human Uncertainty: A Beta-Bernoulli Calibrator for LLM Forecasting
by: Dai, Hui, et al.
Published: (2026)
Anticipatory Evaluation of Language Models
by: Park, Jungsoo, et al.
Published: (2025)
Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings
by: Gopalakrishnan, Anand, et al.
Published: (2025)
Seeking the Unfamiliar but Memorable: Conceptual Creativity as Meta-Learning
by: Ren, Mengye
Published: (2026)
Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents
by: Kim, Kangsan, et al.
Published: (2026)
A General Framework for Inference-time Scaling and Steering of Diffusion Models
by: Singhal, Raghav, et al.
Published: (2025)
Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics
by: Hoang, Christopher, et al.
Published: (2025)
LURE: Latent Space Unblocking for Multi-Concept Reawakening in Diffusion Models
by: Sun, Mengyu, et al.
Published: (2026)
Mechanistic origins of catastrophic forgetting: why RL preserves circuits better than SFT?
by: Nunez, Jeanmely Rojas, et al.
Published: (2026)
AKReF: An argumentative knowledge representation framework for structured argumentation
by: Bhattacharjee, Debarati, et al.
Published: (2025)
Using Pre-trained LLMs for Multivariate Time Series Forecasting
by: Wolff, Malcolm L., et al.
Published: (2025)
KGLink: A column type annotation method that combines knowledge graph and pre-trained language model
by: Wang, Yubo, et al.
Published: (2024)
Pretraining with hierarchical memories: separating long-tail and common knowledge
by: Pouransari, Hadi, et al.
Published: (2025)
Thinking Augmented Pre-training
by: Wang, Liang, et al.
Published: (2025)
Transferable Post-training via Inverse Value Learning
by: Lu, Xinyu, et al.
Published: (2024)
Learning without training: The implicit dynamics of in-context learning
by: Dherin, Benoit, et al.
Published: (2025)
Mapping Technological Futures: Anticipatory Discourse Through Text Mining
by: Skorski, Maciej, et al.
Published: (2025)
Bootstrapping Post-training Signals for Open-ended Tasks via Rubric-based Self-play on Pre-training Text
by: Huang, Chengyu, et al.
Published: (2026)
Variance Control via Weight Rescaling in LLM Pre-training
by: Owen, Louis, et al.
Published: (2025)
Anticipatory Understanding of Resilient Agriculture to Climate
by: Willmes, David, et al.
Published: (2024)
Generative Pre-training for Speech with Flow Matching
by: Liu, Alexander H., et al.
Published: (2023)
Perplexity-Aware Data Scaling Law: Perplexity Landscapes Predict Performance for Continual Pre-training
by: Liu, Lei, et al.
Published: (2025)
Structural Pruning of Pre-trained Language Models via Neural Architecture Search
by: Klein, Aaron, et al.
Published: (2024)
MLKD-BERT: Multi-level Knowledge Distillation for Pre-trained Language Models
by: Zhang, Ying, et al.
Published: (2024)
Counterfactual Evaluation Reveals Hidden Capability Profiles in Clinical LLMs and Agents
by: Turk, Matt
Published: (2026)
Model Merging in Pre-training of Large Language Models
by: Li, Yunshui, et al.
Published: (2025)
Post-training for Efficient Communication via Convention Formation
by: Hua, Yilun, et al.
Published: (2025)
Unmasking Backdoors: An Explainable Defense via Gradient-Attention Anomaly Scoring for Pre-trained Language Models
by: Das, Anindya Sundar, et al.
Published: (2025)
nanoLM: an Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales
by: Yao, Yiqun, et al.
Published: (2023)
Machine-assisted writing evaluation: Exploring pre-trained language models in analyzing argumentative moves
by: Qin, Wenjuan, et al.
Published: (2025)
On the effective transfer of knowledge from English to Hindi Wikipedia
by: Das, Paramita, et al.
Published: (2024)
Methods of improving LLM training stability
by: Rybakov, Oleg, et al.
Published: (2024)
CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models
by: Gu, Jiawei, et al.
Published: (2024)
Collaboratively adding new knowledge to an LLM
by: Lee, Rhui Dih, et al.
Published: (2024)
Robust LLM safeguarding via refusal feature adversarial training
by: Yu, Lei, et al.
Published: (2024)