:: Library Catalog

Bibliographic Details
Main authors:	Wang, Quandong; Yuan, Yuxuan; Yang, Xiaoyu; Zhang, Ruike; Zhao, Kang; Liu, Wei; Luan, Jian; Povey, Daniel; Wang, Bin
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence; I.2.7
Online access:	https://arxiv.org/abs/2406.06571

Similar items

Toward Architecture-Aware Evaluation Metrics for LLM Agents
by: Souza, Débora, et al.
Published: (2026)

FlexQuant: A Flexible and Efficient Dynamic Precision Switching Framework for LLM Quantization
by: Liu, Fangxin, et al.
Published: (2025)

End-to-End Optimization of LLM-Driven Multi-Agent Search Systems via Heterogeneous-Group-Based Reinforcement Learning
by: Chen, Guanzhong, et al.
Published: (2025)

Blocks Architecture (BloArk): Efficient, Cost-Effective, and Incremental Dataset Architecture for Wikipedia Revision History
by: Li, Lingxi, et al.
Published: (2024)

HumanLLM: Benchmarking and Improving LLM Anthropomorphism via Human Cognitive Patterns
by: Wang, Xintao, et al.
Published: (2026)

AsyncTLS: Efficient Generative LLM Inference with Asynchronous Two-level Sparse Attention
by: Hu, Yuxuan, et al.
Published: (2026)

XPath Agent: An Efficient XPath Programming Agent Based on LLM for Web Crawler
by: Li, Yu, et al.
Published: (2024)

LoRS: Efficient Low-Rank Adaptation for Sparse Large Language Model
by: Hu, Yuxuan, et al.
Published: (2025)

QUAD: Quantization and Parameter-Efficient Tuning of LLM with Activation Decomposition
by: Hu, Yuxuan, et al.
Published: (2025)

Evaluating the efficacy of LLM Safety Solutions: The Palit Benchmark Dataset
by: Palit, Sayon, et al.
Published: (2025)

ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning
by: Mi, Zhendong, et al.
Published: (2025)

Beyond Token Length: Step Pruner for Efficient and Accurate Reasoning in Large Language Models
by: Wu, Canhui, et al.
Published: (2025)

TokenStack: A Heterogeneous HBM-PIM Architecture and Runtime for Efficient LLM Inference
by: Li, Zhuoran, et al.
Published: (2026)

Just Pass Twice: Efficient Token Classification with LLMs for Zero-Shot NER
by: Ewais, Ahmed, et al.
Published: (2026)

Efficient Knowledge Feeding to Language Models: A Novel Integrated Encoder-Decoder Architecture
by: Kumar, S Santosh, et al.
Published: (2025)

LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models
by: Pitorro, Hugo, et al.
Published: (2025)

Steering Language Models in Multi-Token Generation: A Case Study on Tense and Aspect
by: Klerings, Alina, et al.
Published: (2025)

PARNESS: A Paper Harness for End-to-End Automated Scientific Research with Dynamic Workflows, Full-Text Indexing, and Cross-Run Knowledge Accumulation
by: Wang, Yuchen, et al.
Published: (2026)

From Pixels to Privacy: Temporally Consistent Video Anonymization via Token Pruning for Privacy Preserving Action Recognition
by: Aslam, Nazia, et al.
Published: (2026)

LLM-GLOBE: A Benchmark Evaluating the Cultural Values Embedded in LLM Output
by: Karinshak, Elise, et al.
Published: (2024)

Random Heterogeneous Neurochaos Learning Architecture for Data Classification
by: S, Remya Ajai A, et al.
Published: (2024)

CRISP: Persistent Concept Unlearning via Sparse Autoencoders
by: Ashuach, Tomer, et al.
Published: (2025)

Good to Go: The LOOP Skill Engine That Hits 99% Success and Slashes Token Usage by 99% via One-Shot Recording and Deterministic Replay
by: Wang, Xiaohua, et al.
Published: (2026)

All for One: LLMs Solve Mental Math at the Last Token With Information Transferred From Other Tokens
by: Mamidanna, Siddarth, et al.
Published: (2025)

Steer-MoE: Efficient Audio-Language Alignment with a Mixture-of-Experts Steering Module
by: Feng, Ruitao, et al.
Published: (2025)

LLM Unlearning on Noisy Forget Sets: A Study of Incomplete, Rewritten, and Watermarked Data
by: Wang, Changsheng, et al.
Published: (2025)

FastForward Pruning: Efficient LLM Pruning via Single-Step Reinforcement Learning
by: Yuan, Xin, et al.
Published: (2025)

Exploiting Pre-trained Encoder-Decoder Transformers for Sequence-to-Sequence Constituent Parsing
by: Fernández-González, Daniel, et al.
Published: (2026)

Xinyu: An Efficient LLM-based System for Commentary Generation
by: Wu, Yiquan, et al.
Published: (2024)

QuickSilver -- Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization
by: Khanna, Danush, et al.
Published: (2025)

Teams of LLM Agents can Exploit Zero-Day Vulnerabilities
by: Zhu, Yuxuan, et al.
Published: (2024)

SAGE: Hierarchical LLM-Based Literary Evaluation through Ontology-Grounded Interpretive Dimensions
by: Wang, Tianyu, et al.
Published: (2026)

Comparative Study of Large Language Models on Chinese Film Script Continuation: An Empirical Analysis Based on GPT-5.2 and Qwen-Max
by: Cao, Yuxuan, et al.
Published: (2026)

Large Language Model (LLM) Bias Index -- LLMBI
by: Oketunji, Abiodun Finbarrs, et al.
Published: (2023)

Efficient LLM Safety Evaluation through Multi-Agent Debate
by: Lin, Dachuan, et al.
Published: (2025)

PromptSAM+: Malware Detection based on Prompt Segment Anything Model
by: Wei, Xingyuan, et al.
Published: (2024)

PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing
by: Deng, Cheng, et al.
Published: (2025)

HyDRA: Hybrid Dynamic Routing Architecture for Heterogeneous LLM Pools
by: Garg, Aashna, et al.
Published: (2026)

Kronecker Embeddings: Byte-Level Structured Token Representations for Parameter-Efficient Language Models
by: Shravan, Rohan
Published: (2026)

SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models
by: Zhang, Yuxuan
Published: (2025)