:: Library Catalog

Bibliographic Details
Main Authors:	Zeng, Zhanpeng; Davies, Michael; Pulijala, Pranav; Sankaralingam, Karthikeyan; Singh, Vikas
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2403.07221

Similar Items

IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers
by: Zeng, Zhanpeng, et al.
Published: (2024)

Computer Architecture's AlphaZero Moment: Automated Discovery in an Encircled World
by: Sankaralingam, Karthikeyan
Published: (2026)

NeuroScalar: A Deep Learning Framework for Fast, Accurate, and In-the-Wild Cycle-Level Performance Prediction
by: Wadle, Shayne, et al.
Published: (2025)

FrameQuant: Flexible Low-Bit Quantization for Transformers
by: Adepu, Harshavardhan, et al.
Published: (2024)

Empirical Bayes Conformal Prediction for Vision and Language Models
by: Zeng, Jiapeng, et al.
Published: (2026)

Spark Transformer: Reactivating Sparsity in FFN and Attention
by: You, Chong, et al.
Published: (2025)

FFN Fusion: Rethinking Sequential Computation in Large Language Models
by: Bercovich, Akhiad, et al.
Published: (2025)

Sparsity Moves Computation: How FFN Architecture Reshapes Attention in Small Transformers
by: Smithline, Gabriel, et al.
Published: (2026)

The Impact Market to Save Conference Peer Review: Decoupling Dissemination and Credentialing
by: Sankaralingam, Karthikeyan
Published: (2025)

Pedagogically Motivated and Composable Open-Source RISC-V Processors for Computer Science Education
by: McDougall, Ian, et al.
Published: (2025)

Causal inference and model explainability tools for retail
by: Gupta, Pranav, et al.
Published: (2025)

LOOKAT: Lookup-Optimized Key-Attention for Memory-Efficient Transformers
by: Karmore, Aryan
Published: (2026)

Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting
by: Zhao, Yanjun, et al.
Published: (2024)

Mixture of Lookup Experts
by: Jie, Shibo, et al.
Published: (2025)

UMoE: Unifying Attention and FFN with Shared Experts
by: Yang, Yuanhang, et al.
Published: (2025)

Mixture of Lookup Key-Value Experts
by: Wang, Zongcheng
Published: (2025)

Fast Forward: Accelerating LLM Prefill with Predictive FFN Sparsity
by: Gautam, Aayush, et al.
Published: (2026)

LIMINAL: Exploring The Frontiers of LLM Decode Performance
by: Davies, Michael, et al.
Published: (2025)

mHC-lite: You Don't Need 20 Sinkhorn-Knopp Iterations
by: Yang, Yongyi, et al.
Published: (2026)

Analytical FFN-to-MoE Restructuring via Activation Pattern Analysis
by: Pei, Zehua, et al.
Published: (2025)

L$^3$: Large Lookup Layers
by: Tseng, Albert, et al.
Published: (2026)

Analytical Provisioning for Attention-FFN Disaggregated LLM Serving under Stochastic Workloads
by: Song, Chendong, et al.
Published: (2026)

Hawk: Accurate and Fast Privacy-Preserving Machine Learning Using Secure Lookup Table Computation
by: Saleem, Hamza, et al.
Published: (2024)

Kitsune: Enabling Dataflow Execution on GPUs
by: Davies, Michael, et al.
Published: (2025)

SAHM: State-Aware Heterogeneous Multicore for Single-Thread Performance
by: Wadle, Shayne, et al.
Published: (2025)

Analytical Exploration of Spatial Audio Cues: A Differentiable Multi-Sphere Scattering Model
by: Galougah, Siminfar Samakoush, et al.
Published: (2026)

Lookup multivariate Kolmogorov-Arnold Networks
by: Pozdnyakov, Sergey, et al.
Published: (2025)

Variable feature weighted fuzzy k-means algorithm for high dimensional data
by: Singh, Vikas, et al.
Published: (2019)

Enhancing Machine Learning for Imbalanced Medical Data: A Quantum-Inspired Approach to Synthetic Oversampling (QI-SMOTE)
by: Kashtriya, Vikas, et al.
Published: (2025)

FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping
by: Jaiswal, Ajay, et al.
Published: (2024)

TabConv: Low-Computation CNN Inference via Table Lookups
by: Gupta, Neelesh, et al.
Published: (2024)

Learning from Biased and Costly Data Sources: Minimax-optimal Data Collection under a Budget
by: Harding, Michael O., et al.
Published: (2026)

CAPA: Contribution-Aware Pruning and FFN Approximation for Efficient Large Vision-Language Models
by: Jha, Samyak, et al.
Published: (2026)

RevFFN: Memory-Efficient Full-Parameter Fine-Tuning of Mixture-of-Experts LLMs with Reversible Blocks
by: Liu, Ningyuan, et al.
Published: (2025)

Perturbation Probing: A Two-Pass-per-Prompt Diagnostic for FFN Behavioral Circuits in Aligned LLMs
by: Liu, Hongliang, et al.
Published: (2026)

BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity
by: Song, Chenyang, et al.
Published: (2025)

Activation Outliers in Transformer Quantization: Reproduction, Statistical Analysis, and Deployment Tradeoffs
by: Kaliaperumal, Pranav Kumar
Published: (2026)

Beyond Static Policies: Exploring Dynamic Policy Selection for Single-Thread Performance Optimization
by: Zhang, Yanxin, et al.
Published: (2026)

On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent
by: Li, Bingrui, et al.
Published: (2024)

Score-based Causal Representation Learning: Linear and General Transformations
by: Varıcı, Burak, et al.
Published: (2024)