| Main Authors: | Sanyal, Arnab, Datta, Gourav, Mukherjee, Prithwish, Chinchali, Sandeep P., Orshansky, Michael |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.02380 |
Similar Items
LE-NeuS: Latency-Efficient Neuro-Symbolic Video Understanding via Adaptive Temporal Verification
by: Liang, Shawn, et al.
Published: (2026)
OASIS: Optimized Lightweight Autoencoder System for Distributed In-Sensor computing
by: Zhou, Chengwei, et al.
Published: (2025)
EntroCut: Entropy-Guided Adaptive Truncation for Efficient Chain-of-Thought Reasoning in Small-scale Large Reasoning Models
by: Yan, Hongxi, et al.
Published: (2026)
Exploiting Distribution Constraints for Scalable and Efficient Image Retrieval
by: Omama, Mohammad, et al.
Published: (2024)
EntroGD: Scalable Generalized Deduplication for Efficient Direct Analytics on Compressed IoT Data
by: Zhao, Xiaobo, et al.
Published: (2025)
MoS-VLA: A Vision-Language-Action Model with One-Shot Skill Adaptation
by: Zhao, Ruihan, et al.
Published: (2025)
TensorCommitments: A Lightweight Verifiable Inference for Language Models
by: Baser, Oguzhan, et al.
Published: (2026)
EntroCoT: Enhancing Chain-of-Thought via Adaptive Entropy-Guided Segmentation
by: Li, Zihang, et al.
Published: (2026)
SecureSpectra: Safeguarding Digital Identity from Deep Fake Threats via Intelligent Signatures
by: Baser, Oguzhan, et al.
Published: (2024)
TinyLLM: Evaluation and Optimization of Small Language Models for Agentic Tasks on Edge Devices
by: Haque, Mohd Ariful, et al.
Published: (2025)
Hermes: Memory-Efficient Pipeline Inference for Large Models on Edge Devices
by: Han, Xueyuan, et al.
Published: (2024)
EntroAD: Structural Entropy-Guided Prompt Adaptation for Zero-Shot Anomaly Detection
by: Zhao, Xinyu, et al.
Published: (2026)
PhonemeFake: Redefining Deepfake Realism with Language-Driven Segmental Manipulation and Adaptive Bilevel Detection
by: Baser, Oguzhan, et al.
Published: (2025)
Learning Scalable Temporal Representations in Spiking Neural Networks Without Labels
by: Zhou, Chengwei, et al.
Published: (2025)
ReDistill: Residual Encoded Distillation for Peak Memory Reduction of CNNs
by: Chen, Fang, et al.
Published: (2024)
Safe Networked Robotics with Probabilistic Verification
by: Narasimhan, Sai Shankar, et al.
Published: (2023)
CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features
by: Li, Po-han, et al.
Published: (2024)
SlimEdge: Performance and Device Aware Distributed DNN Deployment on Resource-Constrained Edge Hardware
by: Kumar, Mahadev Sunil, et al.
Published: (2025)
Presto: Hardware Acceleration of Ciphers for Hybrid Homomorphic Encryption
by: Jeon, Yeonsoo, et al.
Published: (2025)
EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting
by: Yu, Zhongzhi, et al.
Published: (2024)
On-Device Qwen2.5: Efficient LLM Inference with Model Compression and Hardware Acceleration
by: Xiang, Maoyang, et al.
Published: (2025)
Mixed-Precision Quantization for Deep Vision Models with Integer Quadratic Programming
by: Deng, Zihao, et al.
Published: (2023)
Failure-Resilient Distributed Inference with Model Compression over Heterogeneous Edge Devices
by: Wang, Li, et al.
Published: (2024)
Peek2: Regex-free Byte-level Byte-Pair Encoding Pretokenizer for LLM Inference on Edge Devices
by: Zai, Liu, et al.
Published: (2026)
EntroLnn: Entropy-Guided Liquid Neural Networks for Operando Refinement of Battery Capacity Fade Trajectories
by: Li, Wei, et al.
Published: (2026)
GELATO: Generative Entropy- and Lyapunov-based Adaptive Token Offloading for Device-Edge Speculative LLM Inference
by: Tang, Zengzipeng, et al.
Published: (2026)
SynDiff-AD: Improving Semantic Segmentation and End-to-End Autonomous Driving with Synthetic Data from Latent Diffusion Models
by: Goel, Harsh, et al.
Published: (2024)
Dynamic Compressing Prompts for Efficient Inference of Large Language Models
by: Hu, Jinwu, et al.
Published: (2025)
Human-Agent Coordination in Games under Incomplete Information via Multi-Step Intent
by: Chen, Shenghui, et al.
Published: (2024)
IG-MCTS: Human-in-the-Loop Cooperative Navigation under Incomplete Information
by: Chen, Shenghui, et al.
Published: (2025)
Model Compression and Efficient Inference for Large Language Models: A Survey
by: Wang, Wenxiao, et al.
Published: (2024)
EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control
by: Yang, Kai, et al.
Published: (2025)
We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback
by: Choi, Minkyu, et al.
Published: (2025)
Neuro-Symbolic Evaluation of Text-to-Video Models using Formal Verification
by: Sharan, S P, et al.
Published: (2024)
BladderFormer: A Streaming Transformer for Real-Time Urological State Monitoring
by: Zhou, Chengwei, et al.
Published: (2025)
Rethinking Vision Transformer Depth via Structural Reparameterization
by: Zhou, Chengwei, et al.
Published: (2025)
HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices
by: HyperAI Team, et al.
Published: (2025)
EdgeFM: Efficient Edge Inference for Vision-Language Models
by: Deng, Mengling, et al.
Published: (2026)
Time Weaver: A Conditional Time Series Generation Model
by: Narasimhan, Sai Shankar, et al.
Published: (2024)
Designing Efficient LLM Accelerators for Edge Devices
by: Haris, Jude, et al.
Published: (2024)