Library Catalog

Bibliographic Details
Main Authors:	Ashouri, Amir H.; Manzoor, Muhammad Asif; Vu, Duc Minh; Zhang, Raymond; Toft, Colin; Wang, Ziwen; Zhang, Angel; Chan, Bryan; Czajkowski, Tomasz S.; Gao, Yaoqing
Format:	Preprint
Published:	2023
Subjects:	Programming Languages; Artificial Intelligence; Machine Learning; Performance; I.2.5; D.3.0; I.2.6
Online Access:	https://arxiv.org/abs/2312.09982

Similar Items

Protean Compiler: An Agile Framework to Drive Fine-grain Phase Ordering
by: Ashouri, Amir H., et al.
Published: (2026)

Towards Building Private LLMs: Exploring Multi-Node Expert Parallelism on Apple Silicon for Mixture-of-Experts Large Language Model
by: Chen, Mu-Chi, et al.
Published: (2025)

Protection Is (Nearly) All You Need: Structural Protection Dominates Scoring in Globally Capped KV Eviction
by: Garcia, Gabriel
Published: (2026)

Towards A Flexible Accuracy-Oriented Deep Learning Module Inference Latency Prediction Framework for Adaptive Optimization Algorithms
by: Shen, Jingran, et al.
Published: (2023)

Attention in SRAM on Tenstorrent Grayskull
by: Thüning, Moritz
Published: (2024)

When Is the Same Model Not the Same Service? A Measurement Study of Hosted Open-Weight LLM APIs
by: Li, Haorui, et al.
Published: (2026)

Efficient Construction of Large Search Spaces for Auto-Tuning
by: Willemsen, Floris-Jan, et al.
Published: (2025)

StreamIndex: Memory-Bounded Compressed Sparse Attention via Streaming Top-k
by: Jaber, Jaber, et al.
Published: (2026)

Hypergraph Overlapping Community Detection for Brain Networks
by: Vu, Duc, et al.
Published: (2025)

Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models
by: Fu, Tianyu, et al.
Published: (2025)

Accelerating Transfer Function Update for Distance Map based Volume Rendering
by: Rauter, Michael, et al.
Published: (2024)

OrbitFlow: SLO-Aware Long-Context LLM Serving with Fine-Grained KV Cache Reconfiguration
by: Ma, Xinyue, et al.
Published: (2026)

A Performance Evaluation of a Quantized Large Language Model on Various Smartphones
by: Çöplü, Tolga, et al.
Published: (2023)

Inducing Semi-Structured Sparsity by Masking for Efficient Model Inference in Convolutional Networks
by: Danhofer, David A.
Published: (2024)

GVE-LPA: Fast Label Propagation Algorithm (LPA) for Community Detection in Shared Memory Setting
by: Sahu, Subhajit
Published: (2023)

An Incrementally Expanding Approach for Updating PageRank on Dynamic Graphs
by: Sahu, Subhajit
Published: (2024)

DF* PageRank: Improved Incrementally Expanding Approaches for Updating PageRank on Dynamic Graphs
by: Sahu, Subhajit
Published: (2024)

GVE-Louvain: Fast Louvain Algorithm for Community Detection in Shared Memory Setting
by: Sahu, Subhajit
Published: (2023)

GVE-Leiden: Fast Leiden Algorithm for Community Detection in Shared Memory Setting
by: Sahu, Subhajit
Published: (2023)

Lock-Free Computation of PageRank in Dynamic Graphs
by: Sahu, Subhajit
Published: (2024)

Hierarchical Evaluation Function: A Multi-Metric Approach for Optimizing Demand Forecasting Models
by: González, Adolfo, et al.
Published: (2025)

MAS-Attention: Memory-Aware Stream Processing for Attention Acceleration on Resource-Constrained Edge Devices
by: Shakerdargah, Mohammadali, et al.
Published: (2024)

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
by: Fu, Tianyu, et al.
Published: (2025)

PRISMA: Preference-Reinforced Self-Training Approach for Interpretable Emotionally Intelligent Negotiation Dialogues
by: Kajare, Prajwal Vijay, et al.
Published: (2026)

GraphWalk: Enabling Reasoning in Large Language Models through Tool-Based Graph Navigation
by: Ghandi, Taraneh, et al.
Published: (2026)

MH-FSF: A Unified Framework for Overcoming Benchmarking and Reproducibility Limitations in Feature Selection Evaluation
by: Rocha, Vanderson, et al.
Published: (2025)

Unveiling Energy Efficiency in Deep Learning: Measurement, Prediction, and Scoring across Edge Devices
by: Tu, Xiaolong, et al.
Published: (2023)

Interpreting Structured Perturbations in Image Protection Methods for Diffusion Models
by: Martin, Michael R., et al.
Published: (2025)

Parallelization Strategies for Dense LLM Deployment: Navigating Through Application-Specific Tradeoffs and Bottlenecks
by: Topcu, Burak, et al.
Published: (2026)

Taking Flight with Dialogue: Enabling Natural Language Control for PX4-based Drone Agent
by: Lim, Shoon Kit, et al.
Published: (2025)

OFMU: Optimization-Driven Framework for Machine Unlearning
by: Asif, Sadia, et al.
Published: (2025)

Dream to Fly: Model-Based Reinforcement Learning for Vision-Based Drone Flight
by: Romero, Angel, et al.
Published: (2025)

Ultra-Reduced-Impact-Encased-Logging (URIEL): propose a new method for selective sustainable logging and post-harvest silvicultural treatment in tropical forest using airborne robotics systems
by: Albiero, Daniel, et al.
Published: (2026)

LLM-Assisted Formalization Enables Deterministic Detection of Statutory Inconsistency in the Internal Revenue Code
by: Yadamsuren, Borchuluun, et al.
Published: (2025)

FATHOMS-RAG: A Framework for the Assessment of Thinking and Observation in Multimodal Systems that use Retrieval Augmented Generation
by: Hildebrand, Samuel, et al.
Published: (2025)

WebSplatter: Enabling Cross-Device Efficient Gaussian Splatting in Web Browsers via WebGPU
by: Han, Yudong, et al.
Published: (2026)

CortexCompile: Harnessing Cortical-Inspired Architectures for Enhanced Multi-Agent NLP Code Synthesis
by: Ramachandran, Gautham, et al.
Published: (2024)

IMUVIE: Pickup Timeline Action Localization via Motion Movies
by: Clapham, John, et al.
Published: (2024)

HIP Network: Historical Information Passing Network for Extrapolation Reasoning on Temporal Knowledge Graph
by: He, Yongquan, et al.
Published: (2024)

SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs
by: Zhang, Yongkang, et al.
Published: (2024)