:: Library Catalog

Bibliographic Details
Main Authors:	Yehudai, Asaf; Carmeli, Boaz; Mass, Yosi; Arviv, Ofir; Mills, Nathaniel; Toledo, Assaf; Shnarch, Eyal; Choshen, Leshem
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2401.14367

Similar Items

Will it Merge? On The Causes of Model Mergeability
by: Rahamim, Adir, et al.
Published: (2026)

Do These LLM Benchmarks Agree? Fixing Benchmark Evaluation with BenchBench
by: Perlitz, Yotam, et al.
Published: (2024)

Efficient Benchmarking of Language Models
by: Perlitz, Yotam, et al.
Published: (2023)

Mediocrity is the key for LLM as a Judge Anchor Selection
by: Don-Yehiya, Shachar, et al.
Published: (2026)

Guided Query Refinement: Multimodal Hybrid Retrieval with Test-Time Optimization
by: Uzan, Omri, et al.
Published: (2025)

Label-Efficient Model Selection for Text Generation
by: Ashury-Tahan, Shir, et al.
Published: (2024)

The Mighty ToRR: A Benchmark for Table Reasoning and Robustness
by: Ashury-Tahan, Shir, et al.
Published: (2025)

DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation
by: Habba, Eliya, et al.
Published: (2025)

Growing Pains: Extensible and Efficient LLM Benchmarking Via Fixed Parameter Calibration
by: Habba, Eliya, et al.
Published: (2026)

Fine-Grained Detection of Context-Grounded Hallucinations Using LLMs
by: Peisakhovsky, Yehonatan, et al.
Published: (2025)

The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community
by: Don-Yehiya, Shachar, et al.
Published: (2024)

Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI
by: Bandel, Elron, et al.
Published: (2024)

NumeroLogic: Number Encoding for Enhanced LLMs' Numerical Reasoning
by: Schwartz, Eli, et al.
Published: (2024)

Teaching Values to Machines: Simulating Human-Like Behavior in LLMs
by: Yehudai, Asaf, et al.
Published: (2026)

Can Gradient Descent Simulate Prompting?
by: Zhang, Eric, et al.
Published: (2025)

A Hitchhiker's Guide to Scaling Law Estimation
by: Choshen, Leshem, et al.
Published: (2024)

Fuse to Forget: Bias Reduction and Selective Memorization through Model Fusion
by: Zaman, Kerem, et al.
Published: (2023)

Naturally Occurring Feedback is Common, Extractable and Useful
by: Don-Yehiya, Shachar, et al.
Published: (2024)

When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes
by: Yehudai, Asaf, et al.
Published: (2024)

Concept-Best-Matching: Evaluating Compositionality in Emergent Communication
by: Carmeli, Boaz, et al.
Published: (2024)

Instructions Shape Production of Language, not Processing
by: Waldis, Andreas, et al.
Published: (2026)

Jump to Conclusions: Short-Cutting Transformers With Linear Transformations
by: Din, Alexander Yom, et al.
Published: (2023)

ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization
by: Yadav, Prateek, et al.
Published: (2023)

Agentic CLEAR: Automating Multi-Level Evaluation of LLM Agents
by: Yehudai, Asaf, et al.
Published: (2026)

Pretraining Language Models for Diachronic Linguistic Change Discovery
by: Fittschen, Elisabeth, et al.
Published: (2025)

Holmes: A Benchmark to Assess the Linguistic Competence of Language Models
by: Waldis, Andreas, et al.
Published: (2024)

GeniL: A Multilingual Dataset on Generalizing Language
by: Davani, Aida Mostafazadeh, et al.
Published: (2024)

Applying Intrinsic Debiasing on Downstream Tasks: Challenges and Considerations for Machine Translation
by: Iluz, Bar, et al.
Published: (2024)

Beneath the Surface of Consistency: Exploring Cross-lingual Knowledge Representation Sharing in LLMs
by: Ifergan, Maxim, et al.
Published: (2024)

A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns
by: Yehudai, Asaf, et al.
Published: (2024)

Deductive Closure Training of Language Models for Coherence, Accuracy, and Updatability
by: Akyürek, Afra Feyza, et al.
Published: (2024)

LLM Hypnosis: Exploiting User Feedback for Unauthorized Knowledge Injection to All Users
by: Hilel, Almog, et al.
Published: (2025)

Do LLMs Benefit From Their Own Words?
by: Huang, Jenny Y., et al.
Published: (2026)

Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models
by: Gupta, Sonam, et al.
Published: (2025)

WildIFEval: Instruction Following in the Wild
by: Lior, Gili, et al.
Published: (2025)

Resolving Interference (RI): Disentangling Models for Improved Model Merging
by: Ramesh, Pratik, et al.
Published: (2026)

ErrorMap and ErrorAtlas: Charting the Failure Landscape of Large Language Models
by: Ashury-Tahan, Shir, et al.
Published: (2026)

An Analysis of Hyper-Parameter Optimization Methods for Retrieval Augmented Generation
by: Orbach, Matan, et al.
Published: (2025)

Tab-MIA: A Benchmark Dataset for Membership Inference Attacks on Tabular Data in LLMs
by: German, Eyal, et al.
Published: (2025)

Robustness as an Emergent Property of Task Performance
by: Ashury-Tahan, Shir, et al.
Published: (2026)