| Main authors: | Wang, Quandong; Yuan, Yuxuan; Yang, Xiaoyu; Zhang, Ruike; Zhao, Kang; Liu, Wei; Luan, Jian; Povey, Daniel; Wang, Bin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2406.06571 |
Similar items
Toward Architecture-Aware Evaluation Metrics for LLM Agents
by: Souza, Débora, et al.
Published: (2026)
FlexQuant: A Flexible and Efficient Dynamic Precision Switching Framework for LLM Quantization
by: Liu, Fangxin, et al.
Published: (2025)
End-to-End Optimization of LLM-Driven Multi-Agent Search Systems via Heterogeneous-Group-Based Reinforcement Learning
by: Chen, Guanzhong, et al.
Published: (2025)
Blocks Architecture (BloArk): Efficient, Cost-Effective, and Incremental Dataset Architecture for Wikipedia Revision History
by: Li, Lingxi, et al.
Published: (2024)
HumanLLM: Benchmarking and Improving LLM Anthropomorphism via Human Cognitive Patterns
by: Wang, Xintao, et al.
Published: (2026)
AsyncTLS: Efficient Generative LLM Inference with Asynchronous Two-level Sparse Attention
by: Hu, Yuxuan, et al.
Published: (2026)
XPath Agent: An Efficient XPath Programming Agent Based on LLM for Web Crawler
by: Li, Yu, et al.
Published: (2024)
LoRS: Efficient Low-Rank Adaptation for Sparse Large Language Model
by: Hu, Yuxuan, et al.
Published: (2025)
QUAD: Quantization and Parameter-Efficient Tuning of LLM with Activation Decomposition
by: Hu, Yuxuan, et al.
Published: (2025)
Evaluating the efficacy of LLM Safety Solutions : The Palit Benchmark Dataset
by: Palit, Sayon, et al.
Published: (2025)
ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning
by: Mi, Zhendong, et al.
Published: (2025)
Beyond Token Length: Step Pruner for Efficient and Accurate Reasoning in Large Language Models
by: Wu, Canhui, et al.
Published: (2025)
TokenStack: A Heterogeneous HBM-PIM Architecture and Runtime for Efficient LLM Inference
by: Li, Zhuoran, et al.
Published: (2026)
Just Pass Twice: Efficient Token Classification with LLMs for Zero-Shot NER
by: Ewais, Ahmed, et al.
Published: (2026)
Efficient Knowledge Feeding to Language Models: A Novel Integrated Encoder-Decoder Architecture
by: Kumar, S Santosh, et al.
Published: (2025)
LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models
by: Pitorro, Hugo, et al.
Published: (2025)
Steering Language Models in Multi-Token Generation: A Case Study on Tense and Aspect
by: Klerings, Alina, et al.
Published: (2025)
PARNESS: A Paper Harness for End-to-End Automated Scientific Research with Dynamic Workflows, Full-Text Indexing, and Cross-Run Knowledge Accumulation
by: Wang, Yuchen, et al.
Published: (2026)
From Pixels to Privacy: Temporally Consistent Video Anonymization via Token Pruning for Privacy Preserving Action Recognition
by: Aslam, Nazia, et al.
Published: (2026)
LLM-GLOBE: A Benchmark Evaluating the Cultural Values Embedded in LLM Output
by: Karinshak, Elise, et al.
Published: (2024)
Random Heterogeneous Neurochaos Learning Architecture for Data Classification
by: S, Remya Ajai A, et al.
Published: (2024)
CRISP: Persistent Concept Unlearning via Sparse Autoencoders
by: Ashuach, Tomer, et al.
Published: (2025)
Good to Go: The LOOP Skill Engine That Hits 99% Success and Slashes Token Usage by 99% via One-Shot Recording and Deterministic Replay
by: Wang, Xiaohua, et al.
Published: (2026)
All for One: LLMs Solve Mental Math at the Last Token With Information Transferred From Other Tokens
by: Mamidanna, Siddarth, et al.
Published: (2025)
Steer-MoE: Efficient Audio-Language Alignment with a Mixture-of-Experts Steering Module
by: Feng, Ruitao, et al.
Published: (2025)
LLM Unlearning on Noisy Forget Sets: A Study of Incomplete, Rewritten, and Watermarked Data
by: Wang, Changsheng, et al.
Published: (2025)
FastForward Pruning: Efficient LLM Pruning via Single-Step Reinforcement Learning
by: Yuan, Xin, et al.
Published: (2025)
Exploiting Pre-trained Encoder-Decoder Transformers for Sequence-to-Sequence Constituent Parsing
by: Fernández-González, Daniel, et al.
Published: (2026)
Xinyu: An Efficient LLM-based System for Commentary Generation
by: Wu, Yiquan, et al.
Published: (2024)
QuickSilver -- Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization
by: Khanna, Danush, et al.
Published: (2025)
Teams of LLM Agents can Exploit Zero-Day Vulnerabilities
by: Zhu, Yuxuan, et al.
Published: (2024)
SAGE: Hierarchical LLM-Based Literary Evaluation through Ontology-Grounded Interpretive Dimensions
by: Wang, Tianyu, et al.
Published: (2026)
Comparative Study of Large Language Models on Chinese Film Script Continuation: An Empirical Analysis Based on GPT-5.2 and Qwen-Max
by: Cao, Yuxuan, et al.
Published: (2026)
Large Language Model (LLM) Bias Index -- LLMBI
by: Oketunji, Abiodun Finbarrs, et al.
Published: (2023)
Efficient LLM Safety Evaluation through Multi-Agent Debate
by: Lin, Dachuan, et al.
Published: (2025)
PromptSAM+: Malware Detection based on Prompt Segment Anything Model
by: Wei, Xingyuan, et al.
Published: (2024)
PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing
by: Deng, Cheng, et al.
Published: (2025)
HyDRA: Hybrid Dynamic Routing Architecture for Heterogeneous LLM Pools
by: Garg, Aashna, et al.
Published: (2026)
Kronecker Embeddings: Byte-Level Structured Token Representations for Parameter-Efficient Language Models
by: Shravan, Rohan
Published: (2026)
SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models
by: Zhang, Yuxuan
Published: (2025)