:: Library Catalog

Bibliographic Details
Main Authors:	Sanyal, Arnab; Datta, Gourav; Mukherjee, Prithwish; Chinchali, Sandeep P.; Orshansky, Michael
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2505.02380

Similar Items

LE-NeuS: Latency-Efficient Neuro-Symbolic Video Understanding via Adaptive Temporal Verification
by: Liang, Shawn, et al.
Published: (2026)

OASIS: Optimized Lightweight Autoencoder System for Distributed In-Sensor computing
by: Zhou, Chengwei, et al.
Published: (2025)

EntroCut: Entropy-Guided Adaptive Truncation for Efficient Chain-of-Thought Reasoning in Small-scale Large Reasoning Models
by: Yan, Hongxi, et al.
Published: (2026)

Exploiting Distribution Constraints for Scalable and Efficient Image Retrieval
by: Omama, Mohammad, et al.
Published: (2024)

EntroGD: Scalable Generalized Deduplication for Efficient Direct Analytics on Compressed IoT Data
by: Zhao, Xiaobo, et al.
Published: (2025)

MoS-VLA: A Vision-Language-Action Model with One-Shot Skill Adaptation
by: Zhao, Ruihan, et al.
Published: (2025)

TensorCommitments: A Lightweight Verifiable Inference for Language Models
by: Baser, Oguzhan, et al.
Published: (2026)

EntroCoT: Enhancing Chain-of-Thought via Adaptive Entropy-Guided Segmentation
by: Li, Zihang, et al.
Published: (2026)

SecureSpectra: Safeguarding Digital Identity from Deep Fake Threats via Intelligent Signatures
by: Baser, Oguzhan, et al.
Published: (2024)

TinyLLM: Evaluation and Optimization of Small Language Models for Agentic Tasks on Edge Devices
by: Haque, Mohd Ariful, et al.
Published: (2025)

Hermes: Memory-Efficient Pipeline Inference for Large Models on Edge Devices
by: Han, Xueyuan, et al.
Published: (2024)

EntroAD: Structural Entropy-Guided Prompt Adaptation for Zero-Shot Anomaly Detection
by: Zhao, Xinyu, et al.
Published: (2026)

PhonemeFake: Redefining Deepfake Realism with Language-Driven Segmental Manipulation and Adaptive Bilevel Detection
by: Baser, Oguzhan, et al.
Published: (2025)

Learning Scalable Temporal Representations in Spiking Neural Networks Without Labels
by: Zhou, Chengwei, et al.
Published: (2025)

ReDistill: Residual Encoded Distillation for Peak Memory Reduction of CNNs
by: Chen, Fang, et al.
Published: (2024)

Safe Networked Robotics with Probabilistic Verification
by: Narasimhan, Sai Shankar, et al.
Published: (2023)

CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features
by: Li, Po-han, et al.
Published: (2024)

SlimEdge: Performance and Device Aware Distributed DNN Deployment on Resource-Constrained Edge Hardware
by: Kumar, Mahadev Sunil, et al.
Published: (2025)

Presto: Hardware Acceleration of Ciphers for Hybrid Homomorphic Encryption
by: Jeon, Yeonsoo, et al.
Published: (2025)

EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting
by: Yu, Zhongzhi, et al.
Published: (2024)

On-Device Qwen2.5: Efficient LLM Inference with Model Compression and Hardware Acceleration
by: Xiang, Maoyang, et al.
Published: (2025)

Mixed-Precision Quantization for Deep Vision Models with Integer Quadratic Programming
by: Deng, Zihao, et al.
Published: (2023)

Failure-Resilient Distributed Inference with Model Compression over Heterogeneous Edge Devices
by: Wang, Li, et al.
Published: (2024)

Peek2: Regex-free Byte-level Byte-Pair Encoding Pretokenizer for LLM Inference on Edge Devices
by: Zai, Liu, et al.
Published: (2026)

EntroLnn: Entropy-Guided Liquid Neural Networks for Operando Refinement of Battery Capacity Fade Trajectories
by: Li, Wei, et al.
Published: (2026)

GELATO: Generative Entropy- and Lyapunov-based Adaptive Token Offloading for Device-Edge Speculative LLM Inference
by: Tang, Zengzipeng, et al.
Published: (2026)

SynDiff-AD: Improving Semantic Segmentation and End-to-End Autonomous Driving with Synthetic Data from Latent Diffusion Models
by: Goel, Harsh, et al.
Published: (2024)

Dynamic Compressing Prompts for Efficient Inference of Large Language Models
by: Hu, Jinwu, et al.
Published: (2025)

Human-Agent Coordination in Games under Incomplete Information via Multi-Step Intent
by: Chen, Shenghui, et al.
Published: (2024)

IG-MCTS: Human-in-the-Loop Cooperative Navigation under Incomplete Information
by: Chen, Shenghui, et al.
Published: (2025)

Model Compression and Efficient Inference for Large Language Models: A Survey
by: Wang, Wenxiao, et al.
Published: (2024)

EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control
by: Yang, Kai, et al.
Published: (2025)

We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback
by: Choi, Minkyu, et al.
Published: (2025)

Neuro-Symbolic Evaluation of Text-to-Video Models using Formal Verification
by: Sharan, S P, et al.
Published: (2024)

BladderFormer: A Streaming Transformer for Real-Time Urological State Monitoring
by: Zhou, Chengwei, et al.
Published: (2025)

Rethinking Vision Transformer Depth via Structural Reparameterization
by: Zhou, Chengwei, et al.
Published: (2025)

HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices
by: HyperAI Team, et al.
Published: (2025)

EdgeFM: Efficient Edge Inference for Vision-Language Models
by: Deng, Mengling, et al.
Published: (2026)

Time Weaver: A Conditional Time Series Generation Model
by: Narasimhan, Sai Shankar, et al.
Published: (2024)

Designing Efficient LLM Accelerators for Edge Devices
by: Haris, Jude, et al.
Published: (2024)