:: Library Catalog

Bibliographic Details
Main Author:	Iyer, Srikrishna
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2411.16487

Similar Items

Mini Minds: Exploring Bebeshka and Zlata Baby Models
by: Proskurina, Irina, et al.
Published: (2023)

BabyReasoningBench: Generating Developmentally-Inspired Reasoning Tasks for Evaluating Baby Language Models
by: Dhole, Kaustubh D.
Published: (2026)

Baby Scale: Investigating Models Trained on Individual Children's Language Input
by: Feng, Steven Y., et al.
Published: (2026)

BabyVLM: Data-Efficient Pretraining of VLMs Inspired by Infant Learning
by: Wang, Shengao, et al.
Published: (2025)

What Should Baby Models Read? Exploring Sample-Efficient Data Composition on Model Performance
by: Yam, Hong Meng, et al.
Published: (2024)

Bias Dynamics in BabyLMs: Towards a Compute-Efficient Sandbox for Democratising Pre-Training Debiasing
by: Trhlik, Filip, et al.
Published: (2026)

Bringing Up a Bilingual BabyLM: Investigating Multilingual Language Acquisition Using Small-Scale Models
by: Zeng, Linda, et al.
Published: (2026)

EgoBabyVLM: Benchmarking Cross-Modal Learning from Naturalistic Egocentric Video Data
by: Lin, Dongyan, et al.
Published: (2026)

Auditing Google's AI Overviews and Featured Snippets: A Case Study on Baby Care and Pregnancy
by: Hu, Desheng, et al.
Published: (2025)

Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios
by: Zhou, Yuhang, et al.
Published: (2024)

Your Teacher Can't Help You Here: Combating Supervision Fidelity Decay in On-Policy Distillation
by: Liu, Yanjiang, et al.
Published: (2026)

Don't Kill the Baby: The Case for AI in Arbitration
by: Broyde, Michael, et al.
Published: (2024)

Multi-agent AI systems outperform human teams in creativity
by: Hu, Tiancheng, et al.
Published: (2026)

Decoding with Limited Teacher Supervision Requires Understanding When to Trust the Teacher
by: Ok, Hyunjong, et al.
Published: (2024)

BabyLlama-2: Ensemble-Distilled Models Consistently Outperform Teachers With Limited Data
by: Tastet, Jean-Loup, et al.
Published: (2024)

BabyLM Turns 3: Call for papers for the 2025 BabyLM workshop
by: Charpentier, Lucas, et al.
Published: (2025)

ASTRO: Teaching Language Models to Reason by Reflecting and Backtracking In-Context
by: Kim, Joongwon, et al.
Published: (2025)

Code-enabled language models can outperform reasoning models on diverse tasks
by: Zhang, Cedegao E., et al.
Published: (2025)

MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills?
by: Che, Xinyu, et al.
Published: (2026)

On Teacher Hacking in Language Model Distillation
by: Tiapkin, Daniil, et al.
Published: (2025)

LLMs Can Teach Themselves to Better Predict the Future
by: Turtel, Benjamin, et al.
Published: (2025)

Reliability Gated Multi-Teacher Distillation for Low Resource Abstractive Summarization
by: Sumit, Dipto, et al.
Published: (2026)

BabyLM Turns 4 and Goes Multilingual: Call for Papers for the 2026 BabyLM Workshop
by: Choshen, Leshem, et al.
Published: (2026)

Can postgraduate translation students identify machine-generated text?
by: Farrell, Michael
Published: (2025)

When Perplexity Lies: Generation-Focused Distillation of Hybrid Sequence Models
by: Kostelec, Juan Gabriel, et al.
Published: (2026)

CLARity: Reasoning Consistency Alone Can Teach Reinforced Experts
by: Lin, Jiuheng, et al.
Published: (2025)

Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
by: Xu, Wenda, et al.
Published: (2024)

Can Large Models Teach Student Models to Solve Mathematical Problems Like Human Beings? A Reasoning Distillation Method via Multi-LoRA Interaction
by: Li, Xinhe, et al.
Published: (2025)

Steering LLMs? Actually, Sparse Autoencoders can outperform simple baselines
by: Jørgensen, Mikkel Godsk, et al.
Published: (2026)

Backtracking When It Strays: Mitigating Dual Exposure Biases in LLM Reasoning Distillation
by: Wang, Bing, et al.
Published: (2026)

When Can Transformers Count to n?
by: Yehudai, Gilad, et al.
Published: (2024)

Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study
by: Ning, Xuefei, et al.
Published: (2024)

Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review
by: Li, Zhuochun, et al.
Published: (2024)

Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key
by: Wang, Tianle, et al.
Published: (2026)

TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise
by: He, Nan, et al.
Published: (2023)

SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation
by: Nag, Sayan, et al.
Published: (2024)

Are BabyLMs Second Language Learners?
by: Edman, Lukas, et al.
Published: (2024)

ELAD: Explanation-Guided Large Language Models Active Distillation
by: Zhang, Yifei, et al.
Published: (2024)

ASKD-Whisper: Adaptive Self-knowledge Distillation for Efficient and Low-Latency Automatic Speech Recognition
by: Lee, Junseok, et al.
Published: (2026)

Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation
by: Yang, Wenkai, et al.
Published: (2026)