:: Library Catalog

Bibliographic Details
Main Authors:	Wan, Cheng; Tao, Runkai; Du, Zheng; Zhao, Yang Katie; Lin, Yingyan Celine
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2501.01951

Similar Items

ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
by: You, Haoran, et al.
Published: (2023)

MLC-GCN: Multi-Level Generated Connectome Based GCN for AD Analysis
by: Zhu, Wenqi, et al.
Published: (2024)

MoNTA: Accelerating Mixture-of-Experts Training with Network-Traffc-Aware Parallel Optimization
by: Guo, Jingming, et al.
Published: (2024)

PromptGCN: Bridging Subgraph Gaps in Lightweight GCNs
by: Ji, Shengwei, et al.
Published: (2024)

Hybrid GCN-GRU Model for Anomaly Detection in Cryptocurrency Transactions
by: Na, Gyuyeon, et al.
Published: (2025)

Combining GCN Structural Learning with LLM Chemical Knowledge for Enhanced Virtual Screening
by: Berreziga, Radia, et al.
Published: (2025)

MoR: Mixture Of Representations For Mixed-Precision Training
by: Su, Bor-Yiing, et al.
Published: (2025)

MG-Verilog: Multi-grained Dataset Towards Enhanced LLM-assisted Verilog Generation
by: Zhang, Yongan, et al.
Published: (2024)

LUMINA: Laplacian-Unifying Mechanism for Interpretable Neurodevelopmental Analysis via Quad-Stream GCN
by: Cha, Minkyung, et al.
Published: (2026)

Just Propagate: Unifying Matrix Factorization, Network Embedding, and LightGCN for Link Prediction
by: Liu, Haoxin
Published: (2024)

Surface EMG Profiling in Parkinson's Disease: Advancing Severity Assessment with GCN-SVM
by: Cieślak, Daniel, et al.
Published: (2025)

Resilient Temporal GCN for Smart Grid State Estimation Under Topology Inaccuracies
by: Haghshenas, Seyed Hamed, et al.
Published: (2024)

P4GCN: Vertical Federated Social Recommendation with Privacy-Preserving Two-Party Graph Convolution Network
by: Wang, Zheng, et al.
Published: (2024)

MergeMix: Optimizing Mid-Training Data Mixtures via Learnable Model Merging
by: Wang, Jiapeng, et al.
Published: (2026)

WEST GCN-LSTM: Weighted Stacked Spatio-Temporal Graph Neural Networks for Regional Traffic Forecasting
by: Theodoropoulos, Theodoros, et al.
Published: (2024)

RGE-GCN: Recursive Gene Elimination with Graph Convolutional Networks for RNA-seq based Early Cancer Detection
by: Shende, Shreyas, et al.
Published: (2025)

Meta-GCN: A Dynamically Weighted Loss Minimization Method for Dealing with the Data Imbalance in Graph Neural Networks
by: Mohammadizadeh, Mahdi, et al.
Published: (2024)

BlockBatch: Multi-Scale Consensus Decoding for Efficient Diffusion Language Model Inference
by: Wu, Xiaoyou, et al.
Published: (2026)

Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
by: Guo, Yongxin, et al.
Published: (2024)

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
by: You, Haoran, et al.
Published: (2024)

Early-Bird GCNs: Graph-Network Co-Optimization Towards More Efficient GCN Training and Inference via Drawing Early-Bird Lottery Tickets
by: You, Haoran, et al.
Published: (2021)

Dynamic Expert Quantization for Scalable Mixture-of-Experts Inference
by: Chu, Kexin, et al.
Published: (2025)

MA2GCN: Multi Adjacency relationship Attention Graph Convolutional Networks for Traffic Prediction using Trajectory data
by: Sun, Zhengke, et al.
Published: (2024)

Speculating Experts Accelerates Inference for Mixture-of-Experts
by: Madan, Vivan, et al.
Published: (2026)

MixtureKit: A General Framework for Composing, Training, and Visualizing Mixture-of-Experts Models
by: Chamma, Ahmad, et al.
Published: (2025)

KAN-GCN: Combining Kolmogorov-Arnold Network with Graph Convolution Network for an Accurate Ice Sheet Emulator
by: Liu, Zesheng, et al.
Published: (2025)

When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
by: You, Haoran, et al.
Published: (2024)

Elastic MoE: Unlocking the Inference-Time Scalability of Mixture-of-Experts
by: Gu, Naibin, et al.
Published: (2025)

Score-of-Mixture Training: Training One-Step Generative Models Made Simple via Score Estimation of Mixture Distributions
by: Jayashankar, Tejas, et al.
Published: (2025)

SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models
by: Tang, Anke, et al.
Published: (2024)

ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks
by: You, Haoran, et al.
Published: (2022)

MoBiE: Efficient Inference of Mixture of Binary Experts under Post-Training Quantization
by: Zhao, Zhixiong, et al.
Published: (2026)

Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts
by: Nguyen, Xuan-Phi, et al.
Published: (2026)

BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference
by: Jin, Zewen, et al.
Published: (2025)

Accelerating Mixture-of-Expert Inference with Adaptive Expert Split Mechanism
by: Yan, Jiaming, et al.
Published: (2025)

N-vium: Mixture-of-Exits Transformer for Accelerated Exact Generation
by: Lorenc, Aleksander, et al.
Published: (2026)

Mixture of Experts in a Mixture of RL settings
by: Willi, Timon, et al.
Published: (2024)

MoE-DisCo: Low Economy Cost Training Mixture-of-Experts Models
by: Ye, Xin, et al.
Published: (2026)

An Adaptive Placement and Parallelism Framework for Accelerating RLHF Training
by: Xiao, Youshao, et al.
Published: (2023)

Dense Backpropagation Improves Training for Sparse Mixture-of-Experts
by: Panda, Ashwinee, et al.
Published: (2025)