Library Catalog

Bibliographic Details
Main Authors:	Kinoshita, Yuri; Nishikawa, Naoki; Toyoizumi, Taro
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2603.14830

Similar Items

A provable control of sensitivity of neural networks through a direct parameterization of the overall bi-Lipschitzness
by: Kinoshita, Yuri, et al.
Published: (2024)

Mixture of Experts Provably Detect and Learn the Latent Cluster Structure in Gradient-Based Learning
by: Kawata, Ryotaro, et al.
Published: (2025)

Degrees of Freedom for Linear Attention: Distilling Softmax Attention with Optimal Feature Efficiency
by: Nishikawa, Naoki, et al.
Published: (2025)

Cortex and subcortex play distinct roles over learning when cortical memory is limited
by: Farrell, Matthew, et al.
Published: (2026)

State Space Models are Provably Comparable to Transformers in Dynamic Token Selection
by: Nishikawa, Naoki, et al.
Published: (2024)

Gradient-Based Non-Linear Inverse Learning
by: Abhishake, et al.
Published: (2024)

Distilling Linearized Behavior into Non-Linear Fine-Tuning for Effective Task Arithmetic
by: Sommariva, Thomas, et al.
Published: (2026)

Sample-Efficient Linear Representation Learning from Non-IID Non-Isotropic Data
by: Zhang, Thomas T. C. K., et al.
Published: (2023)

Learning Task-Agnostic Representations through Multi-Teacher Distillation
by: Formont, Philippe, et al.
Published: (2025)

DataDAM: Efficient Dataset Distillation with Attention Matching
by: Sajedi, Ahmad, et al.
Published: (2023)

From Low Intrinsic Dimensionality to Non-Vacuous Generalization Bounds in Deep Multi-Task Learning
by: Zakerinia, Hossein, et al.
Published: (2025)

Optimal Task Order for Continual Learning of Multiple Tasks
by: Li, Ziyan, et al.
Published: (2025)

Learning Shared Representations for Multi-Task Linear Bandits
by: Lin, Jiabin, et al.
Published: (2026)

Multi-Task Representation Learning for Conservative Linear Bandits
by: Lin, Jiabin, et al.
Published: (2026)

Causality-Induced Positional Encoding for Transformer-Based Representation Learning of Non-Sequential Features
by: Xu, Kaichen, et al.
Published: (2025)

Disentangling and Mitigating the Impact of Task Similarity for Continual Learning
by: Hiratani, Naoki
Published: (2024)

Reshaping Neural Representation via Associative, Presynaptic Short-Term Plasticity
by: Shimizu, Genki, et al.
Published: (2026)

Learning Dynamical Systems Encoding Non-Linearity within Space Curvature
by: Fichera, Bernardo, et al.
Published: (2024)

Near-optimal and Efficient First-Order Algorithm for Multi-Task Learning with Shared Linear Representation
by: Ding, Shihong, et al.
Published: (2026)

Blurred Encoding for Trajectory Representation Learning
by: Zhou, Silin, et al.
Published: (2025)

Comparison of Autoencoder Encodings for ECG Representation in Downstream Prediction Tasks
by: Harvey, Christopher J., et al.
Published: (2024)

What is Dataset Distillation Learning?
by: Yang, William, et al.
Published: (2024)

Random Gradient-Free Optimization in Infinite Dimensional Spaces
by: Peixoto, Caio Lins, et al.
Published: (2025)

Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context
by: Cheng, Xiang, et al.
Published: (2023)

Dataset Distillation-based Hybrid Federated Learning on Non-IID Data
by: Shi, Xiufang, et al.
Published: (2024)

Spectral Gradient Surgery for Domain-Generalizable Dataset Distillation
by: Oh, Minyoung, et al.
Published: (2026)

Large Language Models Encode Semantics and Alignment in Linearly Separable Representations
by: Saglam, Baturay, et al.
Published: (2025)

Mask-Encoded Sparsification: Mitigating Biased Gradients in Communication-Efficient Split Learning
by: Zhou, Wenxuan, et al.
Published: (2024)

On the Diversity and Realism of Distilled Dataset: An Efficient Dataset Distillation Paradigm
by: Sun, Peng, et al.
Published: (2023)

Learning Linear Regression with Low-Rank Tasks in-Context
by: Takanami, Kaito, et al.
Published: (2025)

LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
by: Robert, Thomas, et al.
Published: (2024)

Learning from Linear Algebra: A Graph Neural Network Approach to Preconditioner Design for Conjugate Gradient Solvers
by: Trifonov, Vladislav, et al.
Published: (2024)

Data-to-Model Distillation: Data-Efficient Learning Framework
by: Sajedi, Ahmad, et al.
Published: (2024)

Dynamics and Representation Structure of Local Approximations to Gradient-Based Learning in Linear Recurrent Neural Networks
by: Williams, Ezekiel, et al.
Published: (2026)

Chebyshev Policies and the Mountain Car Problem: Reinforcement Learning for Low-Dimensional Control Tasks
by: Huber, Stefan, et al.
Published: (2026)

Nonparametric Instrumental Variable Regression through Stochastic Approximate Gradients
by: Fonseca, Yuri, et al.
Published: (2024)

On Learning Representations for Tabular Data Distillation
by: Kang, Inwon, et al.
Published: (2025)

High-Dimensional Search, Low-Dimensional Solution: Decoupling Optimization from Representation
by: Kalyoncuoglu, Yusuf, et al.
Published: (2025)

Learning to Flow from Generative Pretext Tasks for Neural Architecture Encoding
by: Kim, Sunwoo, et al.
Published: (2025)

Exploring the Potential of QEEGNet for Cross-Task and Cross-Dataset Electroencephalography Encoding with Quantum Machine Learning
by: Chen, Chi-Sheng, et al.
Published: (2025)