:: Library Catalog

Bibliographic Details
Main Authors:	Schwartz, Eli; Choshen, Leshem; Shtok, Joseph; Doveh, Sivan; Karlinsky, Leonid; Arbelle, Assaf
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2404.00459

Similar Items

Augmenting In-Context-Learning in LLMs via Automatic Data Labeling and Refinement
by: Shtok, Joseph, et al.
Published: (2024)

MAEDAY: MAE for few and zero shot AnomalY-Detection
by: Schwartz, Eli, et al.
Published: (2022)

Towards Multimodal In-Context Learning for Vision & Language Models
by: Doveh, Sivan, et al.
Published: (2024)

Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning
by: Huang, Brandon, et al.
Published: (2024)

LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content
by: Shabtay, Nimrod, et al.
Published: (2024)

Teaching VLMs to Localize Specific Objects from In-context Examples
by: Doveh, Sivan, et al.
Published: (2024)

Enhancing Few-Shot Vision-Language Classification with Large Multimodal Model Features
by: Mitra, Chancharik, et al.
Published: (2024)

ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs
by: Huang, Irene, et al.
Published: (2024)

Can Gradient Descent Simulate Prompting?
by: Zhang, Eric, et al.
Published: (2025)

The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community
by: Don-Yehiya, Shachar, et al.
Published: (2024)

Naturally Occurring Feedback is Common, Extractable and Useful
by: Don-Yehiya, Shachar, et al.
Published: (2024)

A Hitchhiker's Guide to Scaling Law Estimation
by: Choshen, Leshem, et al.
Published: (2024)

Fuse to Forget: Bias Reduction and Selective Memorization through Model Fusion
by: Zaman, Kerem, et al.
Published: (2023)

Beneath the Surface of Consistency: Exploring Cross-lingual Knowledge Representation Sharing in LLMs
by: Ifergan, Maxim, et al.
Published: (2024)

Do LLMs Benefit From Their Own Words?
by: Huang, Jenny Y., et al.
Published: (2026)

Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs
by: Mirza, M. Jehanzeb, et al.
Published: (2024)

Instructions Shape Production of Language, not Processing
by: Waldis, Andreas, et al.
Published: (2026)

Jump to Conclusions: Short-Cutting Transformers With Linear Transformations
by: Din, Alexander Yom, et al.
Published: (2023)

Mediocrity is the key for LLM as a Judge Anchor Selection
by: Don-Yehiya, Shachar, et al.
Published: (2026)

Comparison Visual Instruction Tuning
by: Lin, Wei, et al.
Published: (2024)

CRISP: Complex Reasoning with Interpretable Step-based Plans
by: Vetzler, Matan, et al.
Published: (2025)

ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization
by: Yadav, Prateek, et al.
Published: (2023)

tinyBenchmarks: evaluating LLMs with fewer examples
by: Polo, Felipe Maia, et al.
Published: (2024)

Holmes: A Benchmark to Assess the Linguistic Competence of Language Models
by: Waldis, Andreas, et al.
Published: (2024)

Pretraining Language Models for Diachronic Linguistic Change Discovery
by: Fittschen, Elisabeth, et al.
Published: (2025)

Genie: Achieving Human Parity in Content-Grounded Datasets Generation
by: Yehudai, Asaf, et al.
Published: (2024)

Deductive Closure Training of Language Models for Coherence, Accuracy, and Updatability
by: Akyürek, Afra Feyza, et al.
Published: (2024)

Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty
by: Damani, Mehul, et al.
Published: (2025)

LLM Hypnosis: Exploiting User Feedback for Unauthorized Knowledge Injection to All Users
by: Hilel, Almog, et al.
Published: (2025)

Will it Merge? On The Causes of Model Mergeability
by: Rahamim, Adir, et al.
Published: (2026)

Resolving Interference (RI): Disentangling Models for Improved Model Merging
by: Ramesh, Pratik, et al.
Published: (2026)

ErrorMap and ErrorAtlas: Charting the Failure Landscape of Large Language Models
by: Ashury-Tahan, Shir, et al.
Published: (2026)

Robustness as an Emergent Property of Task Performance
by: Ashury-Tahan, Shir, et al.
Published: (2026)

Label-Efficient Model Selection for Text Generation
by: Ashury-Tahan, Shir, et al.
Published: (2024)

TextArena
by: Guertler, Leon, et al.
Published: (2025)

The Mighty ToRR: A Benchmark for Table Reasoning and Robustness
by: Ashury-Tahan, Shir, et al.
Published: (2025)

Efficient multi-prompt evaluation of LLMs
by: Polo, Felipe Maia, et al.
Published: (2024)

ChartGen: Scaling Chart Understanding Via Code-Guided Synthetic Chart Generation
by: Kondic, Jovana, et al.
Published: (2025)

Do These LLM Benchmarks Agree? Fixing Benchmark Evaluation with BenchBench
by: Perlitz, Yotam, et al.
Published: (2024)

DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation
by: Habba, Eliya, et al.
Published: (2025)