:: Library Catalog

Bibliographic Details
Main Author:	Xiong, Yuqing
Format:	Preprint
Published:	2022
Subjects:	Distributed, Parallel, and Cluster Computing
Online Access:	https://arxiv.org/abs/2204.06864

Similar Items

A Study of Performance Programming of CPU, GPU accelerated Computers and SIMD Architecture
by: Yi, Xinyao
Published: (2024)

AcOrch: Accelerating Sampling-based GNN Training under CPU-NPU Heterogeneous Environments
by: Chen, Kefu, et al.
Published: (2026)

A Unified CPU-GPU Protocol for GNN Training
by: Lin, Yi-Chien, et al.
Published: (2024)

Some New Approaches to MPI Implementations
by: Xiong, Yuqing
Published: (2024)

Towards Affordable, Adaptive and Automatic GNN Training on CPU-GPU Heterogeneous Platforms
by: Qiao, Tong, et al.
Published: (2025)

Dual-pronged deep learning preprocessing on heterogeneous platforms with CPU, Accelerator and CSD
by: Wei, Jia, et al.
Published: (2024)

WindVE: Collaborative CPU-NPU Vector Embedding
by: Huang, Jinqi, et al.
Published: (2025)

HeteroSTA: A CPU-GPU Heterogeneous Static Timing Analysis Engine with Holistic Industrial Design Support
by: Guo, Zizheng, et al.
Published: (2025)

Orchestrated Co-scheduling, Resource Partitioning, and Power Capping on CPU-GPU Heterogeneous Systems via Machine Learning
by: Saba, Issa, et al.
Published: (2024)

Dissecting CPU-GPU Unified Physical Memory on AMD MI300A APUs
by: Wahlgren, Jacob, et al.
Published: (2025)

Towards CXL Resilience to CPU Failures
by: Psistakis, Antonis, et al.
Published: (2026)

BitVMX: A CPU for Universal Computation on Bitcoin
by: Lerner, Sergio Demian, et al.
Published: (2024)

Evaluating SYCL as a Unified Programming Model for Heterogeneous Systems
by: Marowka, Ami
Published: (2026)

TURNIP: A "Nondeterministic" GPU Runtime with CPU RAM Offload
by: Ding, Zhimin, et al.
Published: (2024)

SPIN: Accelerating Large Language Model Inference with Heterogeneous Speculative Models
by: Chen, Fahao, et al.
Published: (2025)

Efficient Unified Caching for Accelerating Heterogeneous AI Workloads
by: Wang, Tianze, et al.
Published: (2025)

HiCR, an Abstract Model for Distributed Heterogeneous Programming
by: Martin, Sergio Miguel, et al.
Published: (2025)

Combining GPU and CPU for accelerating evolutionary computing workloads
by: Eynaliyev, Rustam, et al.
Published: (2025)

HybridGen: Efficient LLM Generative Inference via CPU-GPU Hybrid Computing
by: Lin, Mao, et al.
Published: (2026)

FLEX: Leveraging FPGA-CPU Synergy for Mixed-Cell-Height Legalization Acceleration
by: Liu, Xingyu, et al.
Published: (2025)

Optimizing Task Scheduling in Heterogeneous Computing Environments: A Comparative Analysis of CPU, GPU, and ASIC Platforms Using E2C Simulator
by: Mohammadjafari, Ali, et al.
Published: (2024)

Efficient Column-Wise N:M Pruning on RISC-V CPU
by: Chu, Chi-Wei, et al.
Published: (2025)

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration
by: Li, Zhonggen, et al.
Published: (2025)

Justin: Hybrid CPU/Memory Elastic Scaling for Distributed Stream Processing
by: Schmitz, Donatien, et al.
Published: (2025)

DiT-HC: Enabling Efficient Training of Visual Generation Model DiT on HPC-oriented CPU Cluster
by: Zhang, Jinxiao, et al.
Published: (2026)

Benchmarking of CPU-intensive Stream Data Processing in The Edge Computing Systems
by: Szydlo, Tomasz, et al.
Published: (2025)

Performance characterisation of the 64-core SG2042 RISC-V CPU for HPC
by: Brown, Nick, et al.
Published: (2024)

MMStencil: Optimizing High-order Stencils on Multicore CPU using Matrix Unit
by: Wang, Yinuo, et al.
Published: (2025)

Taming GPU Underutilization via Static Partitioning and Fine-grained CPU Offloading
by: Schieffer, Gabin, et al.
Published: (2026)

Accelerating Heterogeneous Tensor Parallelism via Flexible Workload Control
by: Wang, Zhigang, et al.
Published: (2024)

APEX: Asynchronous Parallel CPU-GPU Execution for Online LLM Inference on Constrained GPUs
by: Fan, Jiakun, et al.
Published: (2025)

Co-Design and Evaluation of a CPU-Free MPI GPU Communication Abstraction and Implementation
by: Bridges, Patrick G., et al.
Published: (2026)

Serving Hybrid LLM Loads with SLO Guarantees Using CPU-GPU Attention Piggybacking
by: Mo, Zizhao, et al.
Published: (2026)

Aging-aware CPU Core Management for Embodied Carbon Amortization in Cloud LLM Inference
by: Hewage, Tharindu B., et al.
Published: (2025)

Cost-Performance Analysis: A Comparative Study of CPU-Based Serverless and GPU-Based Training Architectures
by: Barrak, Amine, et al.
Published: (2025)

Exploring Uncore Frequency Scaling for Heterogeneous Computing
by: Zheng, Zhong, et al.
Published: (2025)

UMDAM: A Unified Data Layout and DRAM Address Mapping for Heterogenous NPU-PIM
by: Huang, Hai
Published: (2025)

CaraServe: CPU-Assisted and Rank-Aware LoRA Serving for Generative LLM Inference
by: Li, Suyi, et al.
Published: (2024)

Efficient CPU-GPU Collaborative Inference for MoE-based LLMs on Memory-Limited Systems
by: Huang, En-Ming, et al.
Published: (2025)

SiPipe: Bridging the CPU-GPU Utilization Gap for Efficient Pipeline-Parallel LLM Inference
by: He, Yongchao, et al.
Published: (2025)