:: Library Catalog

Bibliographic Details
Main Authors:	Kaneko, Masahiro; Neubig, Graham; Okazaki, Naoaki
Format:	Preprint
Published:	2023
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2305.11789

Similar Items

A Japanese Benchmark for Evaluating Social Bias in Reasoning Based on Attribution Theory
by: Shiotani, Taihei, et al.
Published: (2026)

Evaluating Gender Bias of Pre-trained Language Models in Natural Language Inference by Considering All Labels
by: Anantaprayoon, Panatchakorn, et al.
Published: (2023)

OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples
by: Koike, Ryuto, et al.
Published: (2023)

How You Prompt Matters! Even Task-Oriented Constraints in Instructions Affect LLM-Generated Text Detection
by: Koike, Ryuto, et al.
Published: (2023)

SAIE Framework: Support Alone Isn't Enough -- Advancing LLM Training with Adversarial Remarks
by: Loem, Mengsay, et al.
Published: (2023)

Social Bias Evaluation for Large Language Models Requires Prompt Variations
by: Hida, Rem, et al.
Published: (2024)

Intent-Aware Self-Correction for Mitigating Social Biases in Large Language Models
by: Anantaprayoon, Panatchakorn, et al.
Published: (2025)

Sampling-based Pseudo-Likelihood for Membership Inference Attacks
by: Kaneko, Masahiro, et al.
Published: (2024)

ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability
by: Koike, Ryuto, et al.
Published: (2025)

LLM Output Detectability and Task Performance Can be Jointly Optimized
by: Saito, Koshiro, et al.
Published: (2026)

Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting
by: Kaneko, Masahiro, et al.
Published: (2024)

Stopping Computation for Converged Tokens in Masked Diffusion-LM Decoding
by: Oba, Daisuke, et al.
Published: (2026)

Multi-modal, Multi-task, Multi-criteria Automatic Evaluation with Vision Language Models
by: Ohi, Masanari, et al.
Published: (2024)

Likelihood-based Mitigation of Evaluation Bias in Large Language Models
by: Oi, Masanari, et al.
Published: (2024)

Machine Text Detectors are Membership Inference Attacks
by: Koike, Ryuto, et al.
Published: (2025)

JUBAKU: An Adversarial Benchmark for Exposing Culturally Grounded Stereotypes in Japanese LLMs
by: Shiotani, Taihei, et al.
Published: (2026)

Tokenization as Finite-State Transduction
by: Cognetta, Marco, et al.
Published: (2024)

From Interpretability to Performance: Optimizing Retrieval Heads for Long-Context Language Models
by: Ma, Youmi, et al.
Published: (2026)

Knowledge of Pretrained Language Models on Surface Information of Tokens
by: Hiraoka, Tatsuya, et al.
Published: (2024)

Building a Japanese Document-Level Relation Extraction Dataset Assisted by Cross-Lingual Transfer
by: Ma, Youmi, et al.
Published: (2024)

QuantumBench: A Benchmark for Quantum Problem Solving
by: Minami, Shunya, et al.
Published: (2025)

Bit-level BPE: Below the byte boundary
by: Moon, Sangwhan, et al.
Published: (2025)

Distributional Properties of Subword Regularization
by: Cognetta, Marco, et al.
Published: (2024)

Drifting Objectives for Refining Discrete Diffusion Language Models
by: Oba, Daisuke, et al.
Published: (2026)

Diffusion-State Policy Optimization for Masked Diffusion Language Models
by: Oba, Daisuke, et al.
Published: (2026)

Aligning Tree-Search Policies with Fixed Token Budgets in Test-Time Scaling of LLMs
by: Miyamoto, Sora, et al.
Published: (2026)

Two Counterexamples to Tokenization and the Noiseless Channel
by: Cognetta, Marco, et al.
Published: (2024)

On the Alignment of Large Language Models with Global Human Opinion
by: Liu, Yang, et al.
Published: (2025)

Go-Browse: Training Web Agents with Structured Exploration
by: Gandhi, Apurva, et al.
Published: (2025)

BehaviorBox: Automated Discovery of Fine-Grained Performance Differences Between Language Models
by: Tjuatja, Lindia, et al.
Published: (2025)

Beyond the Resumé: A Rubric-Aware Automatic Interview System for Information Elicitation
by: Stuart, Harry, et al.
Published: (2026)

Decoding-Free Sampling Strategies for LLM Marginalization
by: Pohl, David, et al.
Published: (2025)

CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation
by: Huq, Faria, et al.
Published: (2025)

Effective Strategies for Asynchronous Software Engineering Agents
by: Geng, Jiayi, et al.
Published: (2026)

Paraphrasing Adversarial Attack on LLM-as-a-Reviewer
by: Kaneko, Masahiro
Published: (2026)

A Little Leak Will Sink a Great Ship: Survey of Transparency for Large Language Models from Start to Finish
by: Kaneko, Masahiro, et al.
Published: (2024)

Synthesizing Instruction-Tuning Datasets with Contrastive Decoding
by: Ichinose, Tatsuya, et al.
Published: (2026)

An Analysis of BPE Vocabulary Trimming in Neural Machine Translation
by: Cognetta, Marco, et al.
Published: (2024)

RAGGED: Towards Informed Design of Scalable and Stable RAG Systems
by: Hsia, Jennifer, et al.
Published: (2024)

Coding Agents with Multimodal Browsing are Generalist Problem Solvers
by: Soni, Aditya Bharat, et al.
Published: (2025)