:: Library Catalog

Bibliographic Details
Main Authors:	Kim, Myung Jun; Lefebvre, Félix; Brison, Gaëtan; Perez-Lebel, Alexandre; Varoquaux, Gaël
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2505.14415

Similar Items

Scalable Feature Learning on Huge Knowledge Graphs for Downstream Machine Learning
by: Lefebvre, Félix, et al.
Published: (2025)

CARTE: Pretraining and Transfer for Tabular Learning
by: Kim, Myung Jun, et al.
Published: (2024)

Retrieve, Merge, Predict: Augmenting Tables with Data Lakes
by: Cappuzzo, Riccardo, et al.
Published: (2024)

Decision from Suboptimal Classifiers: Excess Risk Pre- and Post-Calibration
by: Perez-Lebel, Alexandre, et al.
Published: (2025)

TabICLv2: A better, faster, scalable, and open tabular foundation model
by: Qu, Jingang, et al.
Published: (2026)

STRABLE: Benchmarking Tabular Machine Learning with Strings
by: Blayer, Gioia, et al.
Published: (2026)

TabICL: A Tabular Foundation Model for In-Context Learning on Large Data
by: Qu, Jingang, et al.
Published: (2025)

Imputation for prediction: beware of diminishing returns
by: Le Morvan, Marine, et al.
Published: (2024)

On the consistency of supervised learning with missing values
by: Josse, Julie, et al.
Published: (2019)

Reconfidencing LLMs from the Grouping Loss Perspective
by: Chen, Lihu, et al.
Published: (2024)

Extraction of linearized models from pre-trained networks via knowledge distillation
by: Kimura, Fumito, et al.
Published: (2026)

Survival Models: Proper Scoring Rule and Stochastic Optimization with Competing Risks
by: Alberge, Julie, et al.
Published: (2024)

Convex space learning for tabular synthetic data generation
by: Mahendra, Manjunath, et al.
Published: (2024)

Distributionally robust self-supervised learning for tabular data
by: Ghosh, Shantanu, et al.
Published: (2024)

From pre-training to downstream performance: Does domain-specific pre-training make sense?
by: Krones, Felix
Published: (2026)

Understanding the limitations of self-supervised learning for tabular anomaly detection
by: Mai, Kimberly T., et al.
Published: (2023)

Information fusion strategy integrating pre-trained language model and contrastive learning for materials knowledge mining
by: Peng, Yongqian, et al.
Published: (2025)

To Each Metric Its Decoding: Post-Hoc Optimal Decision Rules of Probabilistic Hierarchical Classifiers
by: Plaud, Roman, et al.
Published: (2025)

Causal thinking for decision making on Electronic Health Records: why and how
by: Doutreligne, Matthieu, et al.
Published: (2023)

xRFM: Accurate, scalable, and interpretable feature learning models for tabular data
by: Beaglehole, Daniel, et al.
Published: (2025)

X-TIME: An in-memory engine for accelerating machine learning on tabular data with CAMs
by: Pedretti, Giacomo, et al.
Published: (2023)

Deep generative models as an adversarial attack strategy for tabular machine learning
by: Dyrmishi, Salijona, et al.
Published: (2024)

Anchor-based oversampling for imbalanced tabular data via contrastive and adversarial learning
by: Mohammadi, Hadi, et al.
Published: (2025)

Post-pre-training for Modality Alignment in Vision-Language Foundation Models
by: Yamaguchi, Shin'ya, et al.
Published: (2025)

Dependency-aware synthetic tabular data generation
by: Umesh, Chaithra, et al.
Published: (2025)

DAGAF: A directed acyclic generative adversarial framework for joint structure learning and tabular data synthesis
by: Petkov, Hristo, et al.
Published: (2026)

SeBA: Semi-supervised few-shot learning via Separated-at-Birth Alignment for tabular data
by: Jurek, Kacper, et al.
Published: (2026)

Woosh: A Sound Effects Foundation Model
by: Hadjeres, Gaëtan, et al.
Published: (2026)

A supervised generative optimization approach for tabular data
by: Nakamura-Sakai, Shinpei, et al.
Published: (2023)

Data-efficient pre-training by scaling synthetic megadocs
by: Kim, Konwoo, et al.
Published: (2026)

Leveraging Intermediate Representations of Time Series Foundation Models for Anomaly Detection
by: Han, Chan Sik, et al.
Published: (2025)

MEDS-Tab: Automated tabularization and baseline methods for MEDS datasets
by: Oufattole, Nassim, et al.
Published: (2024)

Closing the gap on tabular data with Fourier and Implicit Categorical Features
by: Dragoi, Marius, et al.
Published: (2026)

TREB: a BERT attempt for imputing tabular data imputation
by: Wang, Shuyue, et al.
Published: (2024)

KGLink: A column type annotation method that combines knowledge graph and pre-trained language model
by: Wang, Yubo, et al.
Published: (2024)

Preserving logical and functional dependencies in synthetic tabular data
by: Umesh, Chaithra, et al.
Published: (2024)

Investigating the Sensitivity of Pre-trained Audio Embeddings to Common Effects
by: Deng, Victor, et al.
Published: (2025)

AutoG: Towards automatic graph construction from tabular data
by: Chen, Zhikai, et al.
Published: (2025)

TOC-UCO: a comprehensive repository of tabular ordinal classification datasets
by: Ayllón-Gavilán, Rafael, et al.
Published: (2025)

Targeted synthetic data generation for tabular data via hardness characterization
by: Ferracci, Tommaso, et al.
Published: (2024)