| Main Authors: | Kim, Myung Jun, Lefebvre, Félix, Brison, Gaëtan, Perez-Lebel, Alexandre, Varoquaux, Gaël |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.14415 |
Similar Items
Scalable Feature Learning on Huge Knowledge Graphs for Downstream Machine Learning
by: Lefebvre, Félix, et al.
Published: (2025)
CARTE: Pretraining and Transfer for Tabular Learning
by: Kim, Myung Jun, et al.
Published: (2024)
Retrieve, Merge, Predict: Augmenting Tables with Data Lakes
by: Cappuzzo, Riccardo, et al.
Published: (2024)
Decision from Suboptimal Classifiers: Excess Risk Pre- and Post-Calibration
by: Perez-Lebel, Alexandre, et al.
Published: (2025)
TabICLv2: A better, faster, scalable, and open tabular foundation model
by: Qu, Jingang, et al.
Published: (2026)
STRABLE: Benchmarking Tabular Machine Learning with Strings
by: Blayer, Gioia, et al.
Published: (2026)
TabICL: A Tabular Foundation Model for In-Context Learning on Large Data
by: Qu, Jingang, et al.
Published: (2025)
Imputation for prediction: beware of diminishing returns
by: Morvan, Marine Le, et al.
Published: (2024)
On the consistency of supervised learning with missing values
by: Josse, Julie, et al.
Published: (2019)
Reconfidencing LLMs from the Grouping Loss Perspective
by: Chen, Lihu, et al.
Published: (2024)
Extraction of linearized models from pre-trained networks via knowledge distillation
by: Kimura, Fumito, et al.
Published: (2026)
Survival Models: Proper Scoring Rule and Stochastic Optimization with Competing Risks
by: Alberge, Julie, et al.
Published: (2024)
Convex space learning for tabular synthetic data generation
by: Mahendra, Manjunath, et al.
Published: (2024)
Distributionally robust self-supervised learning for tabular data
by: Ghosh, Shantanu, et al.
Published: (2024)
From pre-training to downstream performance: Does domain-specific pre-training make sense?
by: Krones, Felix
Published: (2026)
Understanding the limitations of self-supervised learning for tabular anomaly detection
by: Mai, Kimberly T., et al.
Published: (2023)
Information fusion strategy integrating pre-trained language model and contrastive learning for materials knowledge mining
by: Peng, Yongqian, et al.
Published: (2025)
To Each Metric Its Decoding: Post-Hoc Optimal Decision Rules of Probabilistic Hierarchical Classifiers
by: Plaud, Roman, et al.
Published: (2025)
Causal thinking for decision making on Electronic Health Records: why and how
by: Doutreligne, Matthieu, et al.
Published: (2023)
xRFM: Accurate, scalable, and interpretable feature learning models for tabular data
by: Beaglehole, Daniel, et al.
Published: (2025)
X-TIME: An in-memory engine for accelerating machine learning on tabular data with CAMs
by: Pedretti, Giacomo, et al.
Published: (2023)
Deep generative models as an adversarial attack strategy for tabular machine learning
by: Dyrmishi, Salijona, et al.
Published: (2024)
Anchor-based oversampling for imbalanced tabular data via contrastive and adversarial learning
by: Mohammadi, Hadi, et al.
Published: (2025)
Post-pre-training for Modality Alignment in Vision-Language Foundation Models
by: Yamaguchi, Shin'ya, et al.
Published: (2025)
Dependency-aware synthetic tabular data generation
by: Umesh, Chaithra, et al.
Published: (2025)
DAGAF: A directed acyclic generative adversarial framework for joint structure learning and tabular data synthesis
by: Petkov, Hristo, et al.
Published: (2026)
SeBA: Semi-supervised few-shot learning via Separated-at-Birth Alignment for tabular data
by: Jurek, Kacper, et al.
Published: (2026)
Woosh: A Sound Effects Foundation Model
by: Hadjeres, Gaëtan, et al.
Published: (2026)
A supervised generative optimization approach for tabular data
by: Nakamura-Sakai, Shinpei, et al.
Published: (2023)
Data-efficient pre-training by scaling synthetic megadocs
by: Kim, Konwoo, et al.
Published: (2026)
Leveraging Intermediate Representations of Time Series Foundation Models for Anomaly Detection
by: Han, Chan Sik, et al.
Published: (2025)
MEDS-Tab: Automated tabularization and baseline methods for MEDS datasets
by: Oufattole, Nassim, et al.
Published: (2024)
Closing the gap on tabular data with Fourier and Implicit Categorical Features
by: Dragoi, Marius, et al.
Published: (2026)
TREB: a BERT attempt for imputing tabular data imputation
by: Wang, Shuyue, et al.
Published: (2024)
KGLink: A column type annotation method that combines knowledge graph and pre-trained language model
by: Wang, Yubo, et al.
Published: (2024)
Preserving logical and functional dependencies in synthetic tabular data
by: Umesh, Chaithra, et al.
Published: (2024)
Investigating the Sensitivity of Pre-trained Audio Embeddings to Common Effects
by: Deng, Victor, et al.
Published: (2025)
AutoG: Towards automatic graph construction from tabular data
by: Chen, Zhikai, et al.
Published: (2025)
TOC-UCO: a comprehensive repository of tabular ordinal classification datasets
by: Ayllón-Gavilán, Rafael, et al.
Published: (2025)
Targeted synthetic data generation for tabular data via hardness characterization
by: Ferracci, Tommaso, et al.
Published: (2024)