Saved in:
| Main Authors: | Lin, Tzu-Mi; Hirota, Wataru; Ishigaki, Tatsuya; Lee, Lung-Hao; Chen, Chung-Chi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.04972 |
Similar Items
Aggregate vs. Personalized Judges in Business Idea Evaluation: Evidence from Expert Disagreement
by: Hirota, Wataru, et al.
Published: (2026)
Exploring Design of Multi-Agent LLM Dialogues for Research Ideation
by: Ueda, Keisuke, et al.
Published: (2025)
Evaluating Large Language Models as Expert Annotators
by: Tseng, Yu-Min, et al.
Published: (2025)
Multimodal Task Interference: A Benchmark and Analysis of History-Target Mismatch in Multimodal LLMs
by: Kawarada, Masayuki, et al.
Published: (2026)
Prompting for Numerical Sequences: A Case Study on Market Comment Generation
by: Kawarada, Masayuki, et al.
Published: (2024)
Are Expert-Level Language Models Expert-Level Annotators?
by: Tseng, Yu-Min, et al.
Published: (2024)
Modeling Professionalism in Expert Questioning through Linguistic Differentiation
by: D'Agostino, Giulia, et al.
Published: (2025)
Pretraining and Updates of Domain-Specific LLM: A Case Study in the Japanese Business Domain
by: Takahashi, Kosuke, et al.
Published: (2024)
A Comparative Study of Demonstration Selection for Practical Large Language Models-based Next POI Prediction
by: Nishida, Ryo, et al.
Published: (2026)
Enhancing Financial Sentiment Analysis with Expert-Designed Hint
by: Chen, Chung-Chi, et al.
Published: (2024)
"Why" Has the Least Side Effect on Model Editing
by: Pan, Tsung-Hsuan, et al.
Published: (2024)
Beyond Turing Test: Can GPT-4 Sway Experts' Decisions?
by: Takayanagi, Takehiro, et al.
Published: (2024)
From Facts to Insights: A Study on the Generation and Evaluation of Analytical Reports for Deciphering Earnings Calls
by: Goldsack, Tomas, et al.
Published: (2024)
An Effective Mixture-Of-Experts Approach For Code-Switching Speech Recognition Leveraging Encoder Disentanglement
by: Yang, Tzu-Ting, et al.
Published: (2024)
Recurrent Alignment with Hard Attention for Hierarchical Text Rating
by: Lin, Chenxi, et al.
Published: (2024)
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment
by: Liu, Zhili, et al.
Published: (2024)
VIBE: Voice-Induced open-ended Bias Evaluation for Large Audio-Language Models via Real-World Speech
by: Lin, Yi-Cheng, et al.
Published: (2026)
Decision-Oriented Text Evaluation
by: Huang, Yu-Shiang, et al.
Published: (2025)
Enhancing Medication Recommendation with LLM Text Representation
by: Lee, Yu-Tzu
Published: (2024)
When Silence Matters: The Impact of Irrelevant Audio on Text Reasoning in Large Audio-Language Models
by: Li, Chen-An, et al.
Published: (2025)
The Lookahead Limitation: Why Multi-Operand Addition is Hard for LLMs
by: Baeumel, Tanja, et al.
Published: (2025)
Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas
by: Chen, Shiqi, et al.
Published: (2025)
Schema Lineage Extraction at Scale: Multilingual Pipelines, Composite Evaluation, and Language-Model Benchmarks
by: Yin, Jiaqi, et al.
Published: (2025)
Real-Time Generation of Game Video Commentary with Multimodal LLMs: Pause-Aware Decoding Approaches
by: Afzal, Anum, et al.
Published: (2026)
MelHuBERT: A simplified HuBERT on Mel spectrograms
by: Lin, Tzu-Quan, et al.
Published: (2022)
DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging
by: Lin, Tzu-Han, et al.
Published: (2024)
FinNuE: Exposing the Risks of Using BERTScore for Numerical Semantic Evaluation in Finance
by: Huang, Yu-Shiang, et al.
Published: (2025)
DimStance: Multilingual Datasets for Dimensional Stance Analysis
by: Becker, Jonas, et al.
Published: (2026)
Not All Subjectivity Is the Same! Defining Desiderata for the Evaluation of Subjectivity in NLP
by: Khurana, Urja, et al.
Published: (2026)
Poor Alignment and Steerability of Large Language Models: Evidence from College Admission Essays
by: Lee, Jinsook, et al.
Published: (2025)
Routing Absorption in Sparse Attention: Why Random Gates Are Hard to Beat
by: Aquino-Michaels, Keston
Published: (2026)
Speech-FT: Merging Pre-trained And Fine-Tuned Speech Representation Models For Cross-Task Generalization
by: Lin, Tzu-Quan, et al.
Published: (2025)
ARCH2S: Dataset, Benchmark and Challenges for Learning Exterior Architectural Structures from Point Clouds
by: Cheung, Ka Lung, et al.
Published: (2024)
DEER: A Benchmark for Evaluating Deep Research Agents on Expert Report Generation
by: Han, Janghoon, et al.
Published: (2025)
Commitment Checklist: Auditing Author Commitments in Peer Review
by: Chen, Chung-Chi, et al.
Published: (2026)
Detection Is Cheap, Routing Is Learned: Why Refusal-Based Alignment Evaluation Fails
by: Frank, Gregory N.
Published: (2026)
Towards Efficient Visual-Language Alignment of the Q-Former for Visual Reasoning Tasks
by: Kim, Sungkyung, et al.
Published: (2024)
AdaSearch: Balancing Parametric Knowledge and Search in Large Language Models via Reinforcement Learning
by: Lin, Tzu-Han, et al.
Published: (2025)
Why Do We Laugh? Annotation and Taxonomy Generation for Laughable Contexts in Spontaneous Text Conversation
by: Inoue, Koji, et al.
Published: (2025)
Property Neurons in Self-Supervised Speech Transformers
by: Lin, Tzu-Quan, et al.
Published: (2024)