:: Library Catalog

Bibliographic Details
Main Authors:	Ma, Xuezhe; Yang, Xiaomeng; Xiong, Wenhan; Chen, Beidi; Yu, Lili; Zhang, Hao; May, Jonathan; Zettlemoyer, Luke; Levy, Omer; Zhou, Chunting
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2404.08801

Similar Items

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
by: Zhou, Chunting, et al.
Published: (2024)

LMFusion: Adapting Pretrained Language Models for Multimodal Generation
by: Shi, Weijia, et al.
Published: (2024)

Self-Alignment with Instruction Backtranslation
by: Li, Xian, et al.
Published: (2023)

Gecko: An Efficient Neural Architecture Inherently Processing Sequences with Arbitrary Lengths
by: Ma, Xuezhe, et al.
Published: (2026)

CAT: Content-Adaptive Image Tokenization
by: Shen, Junhong, et al.
Published: (2025)

ALMA: Alignment with Minimal Annotation
by: Yasunaga, Michihiro, et al.
Published: (2024)

In-context Pretraining: Language Modeling Beyond Document Boundaries
by: Shi, Weijia, et al.
Published: (2023)

LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
by: Xu, Nan, et al.
Published: (2024)

Towards Chapter-to-Chapter Context-Aware Literary Translation via Large Language Models
by: Jin, Linghao, et al.
Published: (2024)

SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices
by: Svirschevski, Ruslan, et al.
Published: (2024)

Modeling Community Attitude through Reaction Tone: A Human-AI Collaborative Framework for Evaluating LLM Alignment with Linguistic Behaviors in Online Communities
by: Wen, Nuan, et al.
Published: (2026)

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
by: Liang, Weixin, et al.
Published: (2024)

GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity?
by: Zhou, Yang, et al.
Published: (2025)

Efficient Pretraining Length Scaling
by: Wu, Bohong, et al.
Published: (2025)

Beyond Length: Quantifying Long-Range Information for Long-Context LLM Pretraining Data
by: Deng, Haoran, et al.
Published: (2025)

Craw4LLM: Efficient Web Crawling for LLM Pretraining
by: Yu, Shi, et al.
Published: (2025)

Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity
by: Liang, Weixin, et al.
Published: (2025)

Prompt-prompted Adaptive Structured Pruning for Efficient LLM Generation
by: Dong, Harry, et al.
Published: (2024)

DecoPrompt: Decoding Prompts Reduces Hallucinations when Large Language Models Meet False Premises
by: Xu, Nan, et al.
Published: (2024)

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
by: Dong, Harry, et al.
Published: (2024)

Byte Latent Transformer: Patches Scale Better Than Tokens
by: Pagnoni, Artidoro, et al.
Published: (2024)

Megalodon, mako shark and planktonic foraminifera from the continental shelf off Portugal and their age
by: M.T. ANTUNES
Published: (2015)

Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
by: Ren, Liliang, et al.
Published: (2024)

Squeezed Attention: Accelerating Long Context Length LLM Inference
by: Hooper, Coleman, et al.
Published: (2024)

LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models
by: Han, Chi, et al.
Published: (2023)

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
by: Sun, Hanshi, et al.
Published: (2024)

Art Unlimited?
by: Schultheis, Franz, et al.
Published: (2016)

Detecting Pretraining Data from Large Language Models
by: Shi, Weijia, et al.
Published: (2023)

HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
by: Luo, Cheng, et al.
Published: (2025)

Bootstrapping LLM Robustness for VLM Safety via Reducing the Pretraining Modality Gap
by: Yang, Wenhan, et al.
Published: (2025)

Keep Guessing? When Considering Inference Scaling, Mind the Baselines
by: Yona, Gal, et al.
Published: (2024)

MALI: Unlimited Mandate
Published: (2025)

Learning Center Unlimited.
by: Vivrette, Lyndon
Published: (1974)

Computational Tradeoffs in Image Synthesis: Diffusion, Masked-Token, and Next-Token Prediction
by: Kilian, Maciej, et al.
Published: (2024)

Comparing Hallucination Detection Metrics for Multilingual Generation
by: Kang, Haoqiang, et al.
Published: (2024)

Multimodal RewardBench: Holistic Evaluation of Reward Models for Vision Language Models
by: Yasunaga, Michihiro, et al.
Published: (2025)

(Mis)Fitting: A Survey of Scaling Laws
by: Li, Margaret, et al.
Published: (2025)

PatentEdits: Framing Patent Novelty as Textual Entailment
by: Lee, Ryan, et al.
Published: (2024)

APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
by: Yang, Xinyu, et al.
Published: (2025)

Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
by: Qin, Zhen, et al.
Published: (2024)