:: Library Catalog

Bibliographic Details
Main Authors:	Ishibashi, Yoichi; Yano, Taro; Oyamada, Masafumi
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2505.10182

Similar Items

LaMDAgent: An Autonomous Framework for Post-Training Pipeline Optimization via LLM Agents
by: Yano, Taro, et al.
Published: (2025)

Can Large Language Models Invent Algorithms to Improve Themselves?: Algorithm Discovery for Recursive Self-Improvement through Reinforcement Learning
by: Ishibashi, Yoichi, et al.
Published: (2024)

Effective Harness Engineering for Algorithm Discovery with Coding Agents
by: Ishibashi, Yoichi, et al.
Published: (2026)

An Empirical Study of LLM-as-a-Judge: How Design Choices Impact Evaluation Reliability
by: Yamauchi, Yusuke, et al.
Published: (2025)

Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance
by: Tamura, Takuya, et al.
Published: (2025)

Jellyfish: A Large Language Model for Data Preprocessing
by: Zhang, Haochen, et al.
Published: (2023)

DeLTa: A Decoding Strategy based on Logit Trajectory Prediction Improves Factuality and Reasoning Ability
by: He, Yunzhen, et al.
Published: (2025)

Self-Organized Agents: A LLM Multi-Agent Framework toward Ultra Large-Scale Code Generation and Optimization
by: Ishibashi, Yoichi, et al.
Published: (2024)

Understanding Hidden Computations in Chain-of-Thought Reasoning
by: Bharadwaj, Aryasomayajula Ram
Published: (2024)

Subspace Representations for Soft Set Operations and Sentence Similarities
by: Ishibashi, Yoichi, et al.
Published: (2022)

LLM Pretraining with Continuous Concepts
by: Tack, Jihoon, et al.
Published: (2025)

Hidden Error Awareness in Chain-of-Thought Reasoning: The Signal Is Diagnostic, Not Causal
by: Yuan, Aojie, et al.
Published: (2026)

Reasoning-Driven Synthetic Data Generation and Evaluation
by: Davidson, Tim R., et al.
Published: (2026)

BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining
by: DatologyAI, et al.
Published: (2025)

Mining Intrinsic Rewards from LLM Hidden States for Efficient Best-of-N Sampling
by: Guo, Jizhou, et al.
Published: (2025)

RingSQL: Generating Synthetic Data with Schema-Independent Templates for Text-to-SQL Reasoning Models
by: Sterbentz, Marko, et al.
Published: (2026)

RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold
by: Setlur, Amrith, et al.
Published: (2024)

Revisiting Observation Reduction for Web Agents: Comprehensive Evaluation with a Lightweight Framework
by: Enomoto, Masafumi, et al.
Published: (2026)

BioPars: A Pretrained Biomedical Large Language Model for Persian Biomedical Text Mining
by: Merzah, Baqer M., et al.
Published: (2025)

Stabilizing Reasoning in Medical LLMs with Continued Pretraining and Reasoning Preference Optimization
by: Kawakami, Wataru, et al.
Published: (2025)

Deep Active Learning for Data Mining from Conflict Text Corpora
by: Croicu, Mihai
Published: (2024)

ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
by: Yang, Ling, et al.
Published: (2025)

Enhancing Multilingual LLM Pretraining with Model-Based Data Selection
by: Messmer, Bettina, et al.
Published: (2025)

Thought Branches: Interpreting LLM Reasoning Requires Resampling
by: Macar, Uzay, et al.
Published: (2025)

Thought Anchors: Which LLM Reasoning Steps Matter?
by: Bogdan, Paul C., et al.
Published: (2025)

Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pretraining
by: Li, Jeffrey, et al.
Published: (2026)

Faithfulness as Information Flow: Evaluating and Training Faithful Chain-of-Thought Reasoning
by: Jia, Jinghan, et al.
Published: (2026)

Weight-of-Thought Reasoning: Exploring Neural Network Weights for Enhanced LLM Reasoning
by: Punjwani, Saif, et al.
Published: (2025)

MedSyn: LLM-based Synthetic Medical Text Generation Framework
by: Kumichev, Gleb, et al.
Published: (2024)

On Synthesizing Data for Context Attribution in Question Answering
by: Radevski, Gorjan, et al.
Published: (2025)

Ulterior Motives: Detecting Misaligned Reasoning in Continuous Thought Models
by: Ramjee, Sharan
Published: (2026)

TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
by: Li, Jeffrey, et al.
Published: (2025)

Accelerating Unbiased LLM Evaluation via Synthetic Feedback
by: Zhou, Zhaoyi, et al.
Published: (2025)

Synthetic Data Can Mislead Evaluations: Membership Inference as Machine Text Detection
by: Naseh, Ali, et al.
Published: (2025)

Best-of-$\infty$ -- Asymptotic Performance of Test-Time LLM Ensembling
by: Komiyama, Junpei, et al.
Published: (2025)

The Data-Quality Illusion: Rethinking Classifier-Based Quality Filtering for LLM Pretraining
by: Saada, Thiziri Nait, et al.
Published: (2025)

Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
by: Aytes, Simon A., et al.
Published: (2025)

Automated Text Mining of Experimental Methodologies from Biomedical Literature
by: Guo, Ziqing
Published: (2024)

When Chain-of-Thought Fails, the Solution Hides in the Hidden States
by: Mehrafarin, Houman, et al.
Published: (2026)

Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content
by: Touchent, Rian, et al.
Published: (2025)