:: Library Catalog

Bibliographic Details
Main Authors:	Yam, Hong Meng; Paek, Nathan J
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2411.06672

Similar Items

Mini Minds: Exploring Bebeshka and Zlata Baby Models
by: Proskurina, Irina, et al.
Published: (2023)

Constrained Sampling for Language Models Should Be Easy: An MCMC Perspective
by: Gonzalez, Emmanuel Anaya, et al.
Published: (2025)

BabyVLM: Data-Efficient Pretraining of VLMs Inspired by Infant Learning
by: Wang, Shengao, et al.
Published: (2025)

Should LLMs be WEIRD? Exploring WEIRDness and Human Rights in Large Language Models
by: Zhou, Ke, et al.
Published: (2025)

Should You Use Your Large Language Model to Explore or Exploit?
by: Harris, Keegan, et al.
Published: (2025)

When Babies Teach Babies: Can student knowledge sharing outperform Teacher-Guided Distillation on small datasets?
by: Iyer, Srikrishna
Published: (2024)

Baby Scale: Investigating Models Trained on Individual Children's Language Input
by: Feng, Steven Y., et al.
Published: (2026)

Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning
by: Zhao, Jun, et al.
Published: (2024)

Bias Dynamics in BabyLMs: Towards a Compute-Efficient Sandbox for Democratising Pre-Training Debiasing
by: Trhlik, Filip, et al.
Published: (2026)

Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling
by: Aynetdinov, Ansar, et al.
Published: (2026)

Learning to Plan for Language Modeling from Unlabeled Data
by: Cornille, Nathan, et al.
Published: (2024)

Thinking in Many Modes: How Composite Reasoning Elevates Large Language Model Performance with Limited Data
by: Ahmad, Zishan, et al.
Published: (2025)

Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs
by: Smit, Andries, et al.
Published: (2023)

Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models
by: Karov, Bar, et al.
Published: (2025)

Exploring Data-Efficient Adaptation of Large Language Models for Code Generation
by: Jiang, Xue, et al.
Published: (2024)

Models Can and Should Embrace the Communicative Nature of Human-Generated Math
by: Boguraev, Sasha, et al.
Published: (2024)

PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling
by: Xie, Xudong, et al.
Published: (2024)

Preference Curriculum: LLMs Should Always Be Pretrained on Their Preferred Data
by: Zhang, Xuemiao, et al.
Published: (2025)

Balanced Data Sampling for Language Model Training with Clustering
by: Shao, Yunfan, et al.
Published: (2024)

You Are What You Train: Effects of Data Composition on Training Context-aware Machine Translation Models
by: Mąka, Paweł, et al.
Published: (2025)

What are Models Thinking about? Understanding Large Language Model Hallucinations "Psychology" through Model Inner State Analysis
by: Wang, Peiran, et al.
Published: (2025)

Sample-Efficient Language Modeling with Linear Attention and Lightweight Enhancements
by: Haller, Patrick, et al.
Published: (2025)

Dense X Retrieval: What Retrieval Granularity Should We Use?
by: Chen, Tong, et al.
Published: (2023)

Express Your Doubts -- Probabilistic World Modeling Should not be Based on Token logprobs
by: Wagner, Eitan, et al.
Published: (2025)

WebDS: An End-to-End Benchmark for Web-based Data Science
by: Hsu, Ethan, et al.
Published: (2025)

Exploring the Performance of Large Language Models on Subjective Span Identification Tasks
by: Dmonte, Alphaeus, et al.
Published: (2026)

What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions
by: Zhang, Liyi, et al.
Published: (2024)

Target-Aware Language Modeling via Granular Data Sampling
by: Chang, Ernie, et al.
Published: (2024)

Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs?
by: Hase, Peter, et al.
Published: (2024)

BabyReasoningBench: Generating Developmentally-Inspired Reasoning Tasks for Evaluating Baby Language Models
by: Dhole, Kaustubh D.
Published: (2026)

Exploring Data and Parameter Efficient Strategies for Arabic Dialect Identifications
by: Kanjirangat, Vani, et al.
Published: (2025)

Beyond Random Sampling: Efficient Language Model Pretraining via Curriculum Learning
by: Zhang, Yang, et al.
Published: (2025)

Cannot or Should Not? Automatic Analysis of Refusal Composition in IFT/RLHF Datasets and Refusal Behavior of Black-Box LLMs
by: von Recum, Alexander, et al.
Published: (2024)

Where Should Diffusion Enter a Language Model? Geometry-Guided Hidden-State Replacement
by: Kong, Injin, et al.
Published: (2026)

PreCog: Exploring the Relation between Memorization and Performance in Pre-trained Language Models
by: Ranaldi, Leonardo, et al.
Published: (2023)

Bringing Up a Bilingual BabyLM: Investigating Multilingual Language Acquisition Using Small-Scale Models
by: Zeng, Linda, et al.
Published: (2026)

BabyHGRN: Exploring RNNs for Sample-Efficient Training of Language Models
by: Haller, Patrick, et al.
Published: (2024)

When Should Models Change Their Minds? Contextual Belief Management in Large Language Models
by: Xu, Haoming, et al.
Published: (2026)

The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
by: Tang, Zhenheng, et al.
Published: (2025)

Where Should I Study? Biased Language Models Decide! Evaluating Fairness in LMs for Academic Recommendations
by: Shailya, Krithi, et al.
Published: (2025)