:: Library Catalog

Bibliographic Details
Main Authors:	Besta, Maciej; Gerstenberger, Robert; Iff, Patrick; Sonawane, Pournima; Luna, Juan Gómez; Kanakagiri, Raghavendra; Min, Rui; Kwaśniewski, Grzegorz; Mutlu, Onur; Hoefler, Torsten; Appuswamy, Raja; Mahony, Aidan O
Format:	Preprint
Published:	2024
Subjects:	Information Retrieval Performance
Online Access:	https://arxiv.org/abs/2408.12173

Similar Items

FoldedHexaTorus: An Inter-Chiplet Interconnect Topology for Chiplet-based Systems using Organic and Glass Substrates
by: Iff, Patrick, et al.
Published: (2025)

EvalNet: A Practical Toolchain for Generation and Analysis of Extreme-Scale Interconnects
by: Besta, Maciej, et al.
Published: (2021)

Inductive Loop Analysis for Practical HPC Application Optimization
by: Schaad, Philipp, et al.
Published: (2025)

Minimum Cost Loop Nests for Contraction of a Sparse Tensor with a Tensor Network
by: Kanakagiri, Raghavendra, et al.
Published: (2023)

Demystifying Chains, Trees, and Graphs of Thoughts
by: Besta, Maciej, et al.
Published: (2024)

PlaceIT: Placement-based Inter-Chiplet Interconnect Topologies
by: Iff, Patrick, et al.
Published: (2025)

Network Design for Wafer-Scale Systems with Wafer-on-Wafer Hybrid Bonding
by: Iff, Patrick, et al.
Published: (2026)

Near-Optimal Wafer-Scale Reduce
by: Luczynski, Piotr, et al.
Published: (2024)

FPsPIN: An FPGA-based Open-Hardware Research Platform for Processing in the Network
by: Schneider, Timo, et al.
Published: (2024)

High Performance Unstructured SpMM Computation Using Tensor Cores
by: Okanovic, Patrik, et al.
Published: (2024)

Higher-Order Graph Databases
by: Besta, Maciej, et al.
Published: (2025)

Demystifying Higher-Order Graph Neural Networks
by: Besta, Maciej, et al.
Published: (2024)

Benchmarking Filtered Approximate Nearest Neighbor Search Algorithms on Transformer-based Embedding Vectors
by: Iff, Patrick, et al.
Published: (2025)

RapidChiplet: A Toolchain for Rapid Design Space Exploration of Chiplet Architectures
by: Iff, Patrick, et al.
Published: (2023)

Iterating Pointers: Enabling Static Analysis for Loop-based Pointers
by: Lepori, Andrea, et al.
Published: (2025)

A Priori Loop Nest Normalization: Automatic Loop Scheduling in Complex Applications
by: Trümper, Lukas, et al.
Published: (2024)

DaCe AD: Unifying High-Performance Automatic Differentiation for Machine Learning and Scientific Computing
by: Boudaoud, Afif, et al.
Published: (2025)

Affordable AI Assistants with Knowledge Graph of Thoughts
by: Besta, Maciej, et al.
Published: (2025)

Confidential LLM Inference: Performance and Cost Across CPU and GPU TEEs
by: Chrapek, Marcin, et al.
Published: (2025)

EDAN: Towards Understanding Memory Parallelism and Latency Sensitivity in HPC
by: Shen, Siyuan, et al.
Published: (2025)

CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks
by: Besta, Maciej, et al.
Published: (2024)

PICO: Performance Insights for Collective Operations
by: Pasqualoni, Saverio, et al.
Published: (2025)

PerfDojo: Automated ML Library Generation for Heterogeneous Architectures
by: Ivanov, Andrei, et al.
Published: (2025)

Psychologically Enhanced AI Agents
by: Besta, Maciej, et al.
Published: (2025)

Cleaning up the Mess: Re-Evaluating the Real-System Modeling Accuracy of Ramulator 2.0
by: Bostanci, F. Nisa, et al.
Published: (2025)

Assessing the Performance of OpenTitan as Cryptographic Accelerator in Secure Open-Hardware System-on-Chips
by: Parisi, Emanuele, et al.
Published: (2024)

Reasoning Language Models: A Blueprint
by: Besta, Maciej, et al.
Published: (2025)

Denoising Application Performance Models with Noise-Resilient Priors
by: de Morais, Gustavo, et al.
Published: (2025)

PyGim: An Efficient Graph Neural Network Library for Real Processing-In-Memory Architectures
by: Giannoula, Christina, et al.
Published: (2024)

In-Network Collective Operations: Game Changer or Challenge for AI Workloads?
by: Hoefler, Torsten, et al.
Published: (2026)

Hardware-Agnostic and Insightful Efficiency Metrics for Accelerated Systems: Definition and Implementation within TALP
by: Rahimi, Ghazal, et al.
Published: (2026)

ADELIA: Automatic Differentiation for Efficient Laplace Inference Approximations
by: Boudaoud, Afif, et al.
Published: (2026)

DCC: Data-Centric Compilation of Machine Learning Kernels for Processing-In-Memory Architectures
by: Yang, Peiming, et al.
Published: (2025)

BLaST: High Performance Inference and Pretraining using BLock Sparse Transformers
by: Okanovic, Patrik, et al.
Published: (2025)

Swing: Short-cutting Rings for Higher Bandwidth Allreduce
by: De Sensi, Daniele, et al.
Published: (2024)

Multi-Strided Access Patterns to Boost Hardware Prefetching
by: Blom, Miguel O., et al.
Published: (2024)

KForge: Program Synthesis for Diverse AI Hardware Accelerators
by: Sereda, Taras, et al.
Published: (2025)

Solving Combinatorial Optimization Problems on a Photonic Quantum Computer
by: Slysz, Mateusz, et al.
Published: (2024)

Performance-Driven Optimization of Parallel Breadth-First Search
by: Bhaskar, Marati, et al.
Published: (2025)

Long-term Monitoring of Kernel and Hardware Events to Understand Latency Variance
by: Zhou, Fang, et al.
Published: (2026)