| Main Authors: | Chen, Yang; Fang, Cong; Lin, Zhouchen; Liu, Bing |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.11249 |
Similar Items
DIGIC: Domain Generalizable Imitation Learning by Causal Discovery
by: Chen, Yang, et al.
Published: (2024)
Conda: Column-Normalized Adam for Training Large Language Models Faster
by: Wang, Junjie, et al.
Published: (2025)
On the Limitations and Capabilities of Position Embeddings for Length Generalization
by: Chen, Yang, et al.
Published: (2025)
Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
by: Wu, Zhoutong, et al.
Published: (2025)
Attention Sinks Induce Gradient Sinks: Massive Activations as Gradient Regulators in Transformers
by: Chen, Yihong, et al.
Published: (2026)
Training-Free Message Passing for Learning on Hypergraphs
by: Tang, Bohan, et al.
Published: (2024)
On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
by: Liu, Renpu, et al.
Published: (2024)
Quadratic Direct Forecast for Training Multi-Step Time-Series Forecast Models
by: Wang, Hao, et al.
Published: (2025)
Generalist++: A Meta-learning Framework for Mitigating Trade-off in Adversarial Training
by: Wang, Yisen, et al.
Published: (2025)
Link Prediction with Relational Hypergraphs
by: Huang, Xingyue, et al.
Published: (2024)
Proximity Matters: Local Proximity Enhanced Balancing for Treatment Effect Estimation
by: Wang, Hao, et al.
Published: (2024)
Hyperbolic Hypergraph Neural Networks for Multi-Relational Knowledge Hypergraph Representation
by: Li, Mengfan, et al.
Published: (2024)
Protecting Copyright of Medical Pre-trained Language Models: Training-Free Backdoor Model Watermarking
by: Kong, Cong, et al.
Published: (2024)
Online Pseudo-Zeroth-Order Training of Neuromorphic Spiking Neural Networks
by: Xiao, Mingqing, et al.
Published: (2024)
FedAdamW: A Communication-Efficient Optimizer with Convergence and Generalization Guarantees for Federated Large Models
by: Liu, Junkang, et al.
Published: (2025)
Simple Convergence Proof of Adam From a Sign-like Descent Perspective
by: Peng, Hanyang, et al.
Published: (2025)
Enhanced Atrial Fibrillation Prediction in ESUS Patients with Hypergraph-based Pre-training
by: Xie, Yuzhang, et al.
Published: (2026)
Contrastive Language-Image Pre-Training Model based Semantic Communication Performance Optimization
by: Yang, Shaoran, et al.
Published: (2025)
Task-Aware Parameter-Efficient Fine-Tuning of Large Pre-Trained Models at the Edge
by: Hu, Senkang, et al.
Published: (2025)
PTMs-TSCIL Pre-Trained Models Based Class-Incremental Learning
by: Wu, Yuanlong, et al.
Published: (2025)
Theoretical Perspectives on Data Quality and Synergistic Effects in Pre- and Post-Training Reasoning Models
by: Javanmard, Adel, et al.
Published: (2026)
CyclicFL: A Cyclic Model Pre-Training Approach to Efficient Federated Learning
by: Zhang, Pengyu, et al.
Published: (2023)
A Survey on Time-Series Pre-Trained Models
by: Ma, Qianli, et al.
Published: (2023)
ADORA: Training Reasoning Models with Dynamic Advantage Estimation on Reinforcement Learning
by: Ren, Qingnan, et al.
Published: (2026)
HypergraphFormer: Learning Hypergraphs from LLMs for Editable Floor Plan Generation
by: Klimenko, Nikita, et al.
Published: (2026)
Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks
by: Xiao, Mingqing, et al.
Published: (2024)
On the Learnability of Test-Time Adaptation: A Recovery Complexity Perspective
by: Zhou, Zhi, et al.
Published: (2026)
Reinforcement Learning on Pre-Training Data
by: Li, Siheng, et al.
Published: (2025)
On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models
by: Farhat, Sean, et al.
Published: (2024)
Implicit Hypergraph Neural Networks: A Stable Framework for Higher-Order Relational Learning with Provable Guarantees
by: Li, Xiaoyu, et al.
Published: (2025)
How Particle System Theory Enhances Hypergraph Message Passing
by: Ma, Yixuan, et al.
Published: (2025)
Combining Pre-Trained Models for Enhanced Feature Representation in Reinforcement Learning
by: Piccoli, Elia, et al.
Published: (2025)
Single Parent Family: A Spectrum of Family Members from a Single Pre-Trained Foundation Model
by: Hajimolahoseini, Habib, et al.
Published: (2024)
A Survey of Few-Shot Learning on Graphs: from Meta-Learning to Pre-Training and Prompt Learning
by: Yu, Xingtong, et al.
Published: (2024)
Ensemble of Pre-Trained Models for Long-Tailed Trajectory Prediction
by: Thuremella, Divya, et al.
Published: (2025)
CHGNN: A Semi-Supervised Contrastive Hypergraph Learning Network
by: Song, Yumeng, et al.
Published: (2023)
SeqFusion: Sequential Fusion of Pre-Trained Models for Zero-Shot Time-Series Forecasting
by: Huang, Ting-Ji, et al.
Published: (2025)
A2PO: Towards Effective Offline Reinforcement Learning from an Advantage-aware Perspective
by: Qing, Yunpeng, et al.
Published: (2024)
On the Cause of Unfairness: A Training Sample Perspective
by: Yao, Yuanshun, et al.
Published: (2023)
On the Adversarial Transferability of Generalized "Skip Connections"
by: Wang, Yisen, et al.
Published: (2024)