| Main Authors: | Chang, Yen-Hsiang, Pu, Jianhao, Hwu, Wen-mei, Xiong, Jinjun |
|---|---|
| Format: | Preprint |
| Published: | 2021 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2111.05231 |
Similar Items
Parallelizing Maximal Clique Enumeration on GPUs
by: Almasri, Mohammad, et al.
Published: (2022)
An MLCommons Scientific Benchmarks Ontology
by: Hawks, Ben, et al.
Published: (2025)
Improvements & Evaluations on the MLCommons CloudMask Benchmark
by: Chennamsetti, Varshitha, et al.
Published: (2024)
Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses
by: Park, Jeongmin Brian, et al.
Published: (2023)
LSM-GNN: Large-scale Storage-based Multi-GPU GNN Training by Optimizing Data Transfer Scheme
by: Park, Jeongmin Brian, et al.
Published: (2024)
MLCommons Chakra: Advancing Performance Benchmarking and Co-design using Standardized Execution Traces
by: Sridharan, Srinivas, et al.
Published: (2026)
xMLP: Revolutionizing Private Inference with Exclusive Square Activation
by: Li, Jiajie, et al.
Published: (2024)
Achieving the Asymptotically Optimal Sample Complexity of Offline Reinforcement Learning: A DRO-Based Approach
by: Wang, Yue, et al.
Published: (2023)
Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level
by: Hassani, Ali, et al.
Published: (2024)
FP64 is All You Need: Rethinking Failure Modes in Physics-Informed Neural Networks
by: Xu, Chenhui, et al.
Published: (2025)
VibeTensor: System Software for Deep Learning, Fully Generated by AI Agents
by: Xu, Bing, et al.
Published: (2026)
Rollback-Free Stable Brick Structures Generation
by: Xu, Chenhui, et al.
Published: (2026)
Forecasting the U.S. Treasury Yield Curve: A Distributionally Robust Machine Learning Approach
by: Liu, Jinjun, et al.
Published: (2026)
A Multi-Perspective Architecture for Semantic Code Search
by: Haldar, Rajarshi, et al.
Published: (2020)
Beyond $\ell_2$-norm and $\ell_\infty$-norm: A Curvature-Inspired $\ell_p$-Norm Scheme for Deep Neural Networks
by: Xu, Jianhao, et al.
Published: (2026)
QuadraNet V2: Efficient and Sustainable Training of High-Order Neural Networks with Quadratic Adaptation
by: Xu, Chenhui, et al.
Published: (2024)
Ensembler: Protect Collaborative Inference Privacy from Model Inversion Attack via Selective Ensemble
by: Liu, Dancheng, et al.
Published: (2024)
Auto-Prompt Ensemble for LLM Judge
by: Li, Jiajie, et al.
Published: (2025)
Demonstration-Free Robotic Control via LLM Agents
by: Tsui, Brian Y., et al.
Published: (2026)
LLM4FS: Leveraging Large Language Models for Feature Selection
by: Li, Jianhao, et al.
Published: (2025)
AUGUSTUS: An LLM-Driven Multimodal Agent System with Contextualized User Memory
by: Jain, Jitesh, et al.
Published: (2025)
Infinite-Dimensional Feature Interaction
by: Xu, Chenhui, et al.
Published: (2024)
FL-NAS: Towards Fairness of NAS for Resource Constrained Devices via Large Language Models
by: Qin, Ruiyang, et al.
Published: (2024)
A Survey of Data Synthesis Approaches
by: Chang, Hsin-Yu, et al.
Published: (2024)
STLGT: A Scalable Trace-Based Linear Graph Transformer for Tail Latency Prediction in Microservices
by: Ding, Yongliang, et al.
Published: (2026)
MaskOpt: A Large-Scale Mask Optimization Dataset to Advance AI in Integrated Circuit Manufacturing
by: Hu, Yuting, et al.
Published: (2025)
Can Learning Be Explained By Local Optimality In Robust Low-rank Matrix Recovery?
by: Ma, Jianhao, et al.
Published: (2023)
Convergence of Gradient Descent with Small Initialization for Unregularized Matrix Completion
by: Ma, Jianhao, et al.
Published: (2024)
Tuning the Implicit Regularizer of Masked Diffusion Language Models: Enhancing Generalization via Insights from $k$-Parity
by: Huang, Jianhao, et al.
Published: (2026)
A PyTorch Framework for Scalable Non-Crossing Quantile Regression
by: Chang, Kaihua
Published: (2025)
A Scalable Algorithm for Active Learning
by: Chen, Youguang, et al.
Published: (2024)
MLCommons Cloud Masking Benchmark with Early Stopping
by: Chennamsetti, Varshitha, et al.
Published: (2023)
Towards Understanding Multi-Round Large Language Model Reasoning: Approximability, Learnability and Generalizability
by: Xu, Chenhui, et al.
Published: (2025)
IRB: Automated Generation of Robust Factuality Benchmarks
by: Do, Lam Thanh, et al.
Published: (2026)
Benchmarking In-context Experiential Learning Through Repeated Product Recommendations
by: Yang, Gilbert, et al.
Published: (2025)
Sub-Sequential Physics-Informed Learning with State Space Model
by: Xu, Chenhui, et al.
Published: (2025)
Detecting and Ranking Causal Anomalies in End-to-End Complex System
by: Chang, Ching, et al.
Published: (2023)
From Global to Local: A Scalable Benchmark for Local Posterior Sampling
by: Hitchcock, Rohan, et al.
Published: (2025)
Quantum Algorithms for Projection-Free Sparse Convex Optimization
by: He, Jianhao, et al.
Published: (2025)
UniZero: Generalized and Efficient Planning with Scalable Latent World Models
by: Pu, Yuan, et al.
Published: (2024)