:: Library Catalog

Bibliographic Details
Main Authors:	Novikov, Georgii; Oseledets, Ivan
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2407.15545

Similar Items

Tensor-Train Point Cloud Compression and Efficient Approximate Nearest-Neighbor Search
by: Novikov, Georgii, et al.
Published: (2024)

Quasi-Random Physics-informed Neural Networks
by: Yu, Tianchi, et al.
Published: (2025)

Spectral Informed Neural Network: An Efficient and Low-Memory PINN
by: Yu, Tianchi, et al.
Published: (2024)

RECE: Reduced Cross-Entropy Loss for Large-Catalogue Sequential Recommenders
by: Gusak, Danil, et al.
Published: (2024)

Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts
by: Sivtsov, Danil, et al.
Published: (2025)

Linearly Constrained Weights: Reducing Activation Shift for Faster Training of Neural Networks
by: Kutsuna, Takuro
Published: (2024)

On the Spatial Structure of Mixture-of-Experts in Transformers
by: Bershatsky, Daniel, et al.
Published: (2025)

Exploring the Hidden Capacity of LLMs for One-Step Text Generation
by: Mezentsev, Gleb, et al.
Published: (2025)

Binding threshold units with artificial oscillatory neurons
by: Fanaskov, Vladimir, et al.
Published: (2025)

MLPMoE: Zero-Shot Architectural Metamorphosis of Dense LLM MLPs into Static Mixture-of-Experts
by: Novikov, Ivan
Published: (2025)

Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition
by: Basharin, Artem, et al.
Published: (2024)

Run LoRA Run: Faster and Lighter LoRA Implementations
by: Cherniuk, Daria, et al.
Published: (2023)

FreshGNN: Reducing Memory Access via Stable Historical Embeddings for Graph Neural Network Training
by: Huang, Kezhao, et al.
Published: (2023)

The Rogue Scalpel: Activation Steering Compromises LLM Safety
by: Korznikov, Anton, et al.
Published: (2025)

Monitoring Neural Training with Topology: A Footprint-Predictable Collapse Index
by: Kalinowski, Alexander
Published: (2026)

Learning from Linear Algebra: A Graph Neural Network Approach to Preconditioner Design for Conjugate Gradient Solvers
by: Trifonov, Vladislav, et al.
Published: (2024)

Spectral Analysis of the Weighted Frobenius Objective
by: Trifonov, Vladislav, et al.
Published: (2025)

Bayesian Inverse Problems Meet Flow Matching: Efficient and Flexible Inference via Transformers
by: Sherki, Daniil, et al.
Published: (2025)

Memory Faults in Activation-sparse Quantized Deep Neural Networks: Analysis and Mitigation using Sharpness-aware Training
by: Malhotra, Akul, et al.
Published: (2024)

Reducing Smoothness with Expressive Memory Enhanced Hierarchical Graph Neural Networks
by: Bailie, Thomas, et al.
Published: (2025)

ConDiff: A Challenging Dataset for Neural Solvers of Partial Differential Equations
by: Trifonov, Vladislav, et al.
Published: (2024)

Explicit Flow Matching: On The Theory of Flow Matching Algorithms with Applications
by: Ryzhakov, Gleb, et al.
Published: (2024)

MaxInfo: A Training-Free Key-Frame Selection Method Using Maximum Volume for Enhanced Video Understanding
by: Li, Pengyi, et al.
Published: (2025)

Message-Passing GNNs Fail to Approximate Sparse Triangular Factorizations
by: Trifonov, Vladislav, et al.
Published: (2025)

DNN Memory Footprint Reduction via Post-Training Intra-Layer Multi-Precision Quantization
by: Ghavami, Behnam, et al.
Published: (2024)

Topology-based Representative Datasets to Reduce Neural Network Training Resources
by: Gonzalez-Diaz, Rocio, et al.
Published: (2019)

Framework GNN-AID: Graph Neural Network Analysis Interpretation and Defense
by: Lukyanov, Kirill, et al.
Published: (2025)

A case study of spatiotemporal forecasting techniques for weather forecasting
by: Sofi, Shakir Showkat, et al.
Published: (2022)

Sparse and Transferable Universal Singular Vectors Attack
by: Kuvshinova, Kseniia, et al.
Published: (2024)

Inverting Non-Injective Functions with Twin Neural Network Regression
by: Wetzel, Sebastian J.
Published: (2026)

Black-Box Approximation and Optimization with Hierarchical Tucker Decomposition
by: Ryzhakov, Gleb, et al.
Published: (2024)

Scalable Cross-Entropy Loss for Sequential Recommendations with Large Item Catalogs
by: Mezentsev, Gleb, et al.
Published: (2024)

Back to Basics: Revisiting Exploration in Reinforcement Learning for LLM Reasoning via Generative Probabilities
by: Li, Pengyi, et al.
Published: (2026)

NNTile: a machine learning framework capable of training extremely large GPT language models on a single node
by: Mikhalev, Aleksandr, et al.
Published: (2025)

Training Memory in Deep Neural Networks: Mechanisms, Evidence, and Measurement Gaps
by: Sevetlidis, Vasileios, et al.
Published: (2026)

Marchuk: Efficient Global Weather Forecasting from Mid-Range to Sub-Seasonal Scales via Flow Matching
by: Kuzhamuratov, Arsen, et al.
Published: (2026)

OASIS: Online Activation Subspace Learning for Memory-Efficient Training
by: Choudhary, Sakshi, et al.
Published: (2026)

FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training
by: Zmushko, Philip, et al.
Published: (2024)

OUI as a Structural Observable: Towards an Activation-Centric View of Neural Network Training
by: Fernández-Hernández, Alberto, et al.
Published: (2026)

Semiring Activation in Neural Networks
by: Smets, Bart M. N., et al.
Published: (2024)