| Main Authors: | Matsuzaki, Fuka; Sato, Haru-Tada |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.05665 |
Similar Items
Ensemble Bayesian Inference: Leveraging Small Language Models to Achieve LLM-level Accuracy in Profile Matching Tasks
by: Sato, Haru-Tada, et al.
Published: (2025)
MskAge —An Epigenetic Biomarker of Musculoskeletal Age Derived From a Genetic Algorithm Islands Model
by: Daniel C. Green, et al.
Published: (2025)
Gauge Invariant and Generic Formulation of Magnetic Translations and so(3,1) Curtright-Zachos Generators
by: Sato, Haru-Tada
Published: (2025)
Quantum Superspace and Bloch Electron Systems with Zeeman Effects: *-Bracket Formalism for Super Curtright-Zachos Algebras
by: Sato, Haru-Tada
Published: (2024)
Curtright-Zachos Supersymmetric Deformations of the Virasoro algebra in Quantum Superspace and Bloch Electron Systems
by: Sato, Haru-Tada
Published: (2024)
Moyal product and Generalized Hom-Lie-Virasoro symmetries in Bloch electron systems
by: Sato, Haru-Tada
Published: (2024)
IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce
by: Ding, Wenxuan, et al.
Published: (2024)
StrucText-Eval: Evaluating Large Language Model's Reasoning Ability in Structure-Rich Text
by: Gu, Zhouhong, et al.
Published: (2024)
SEC-QA: A Systematic Evaluation Corpus for Financial QA
by: Lai, Viet Dac, et al.
Published: (2024)
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models
by: Parmar, Mihir, et al.
Published: (2024)
Evaluating the Process Modeling Abilities of Large Language Models -- Preliminary Foundations and Results
by: Fettke, Peter, et al.
Published: (2025)
KatotohananQA: Evaluating Truthfulness of Large Language Models in Filipino
by: Nery, Lorenzo Alfred, et al.
Published: (2025)
Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Model
by: Aggarwal, Divyanshu, et al.
Published: (2024)
On the Reasoning Abilities of Masked Diffusion Language Models
by: Svete, Anej, et al.
Published: (2025)
Investigating Large Language Models' Linguistic Abilities for Text Preprocessing
by: Braga, Marco, et al.
Published: (2025)
Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models
by: He, Yancheng, et al.
Published: (2024)
Labrador: Exploring the Limits of Masked Language Modeling for Laboratory Data
by: Bellamy, David R., et al.
Published: (2023)
ICLEval: Evaluating In-Context Learning Ability of Large Language Models
by: Chen, Wentong, et al.
Published: (2024)
Exploring Performance Contrasts in TableQA: Step-by-Step Reasoning Boosts Bigger Language Models, Limits Smaller Language Models
by: Yang, Haoyan, et al.
Published: (2024)
Can Perplexity Reflect Large Language Model's Ability in Long Text Understanding?
by: Hu, Yutong, et al.
Published: (2024)
TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models
by: Yu, Ping, et al.
Published: (2024)
Evaluating the Ability of Large Language Models to Reason about Cardinal Directions
by: Cohn, Anthony G, et al.
Published: (2024)
Masking in Multi-hop QA: An Analysis of How Language Models Perform with Context Permutation
by: Huang, Wenyu, et al.
Published: (2025)
Exploring Language Model Generalization in Low-Resource Extractive QA
by: Sengupta, Saptarshi, et al.
Published: (2024)
Exploring the Limits of Model Compression in LLMs: A Knowledge Distillation Study on QA Tasks
by: Datta, Joyeeta, et al.
Published: (2025)
RephQA: Evaluating Readability of Large Language Models in Public Health Question Answering
by: Qiu, Weikang, et al.
Published: (2025)
BaziQA-Benchmark: Evaluating Symbolic and Temporally Compositional Reasoning in Large Language Models
by: Chen, Jiangxi, et al.
Published: (2026)
Evaluating the Ability of Large Language Models to Reason about Cardinal Directions, Revisited
by: Cohn, Anthony G, et al.
Published: (2025)
Evaluation of Instruction-Following Ability for Large Language Models on Story-Ending Generation
by: Hida, Rem, et al.
Published: (2024)
Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability
by: Xu, Zhuoyan, et al.
Published: (2024)
FinanceQA: A Benchmark for Evaluating Financial Analysis Capabilities of Large Language Models
by: Mateega, Spencer, et al.
Published: (2025)
Lost in the Middle, and In-Between: Enhancing Language Models' Ability to Reason Over Long Contexts in Multi-Hop QA
by: Baker, George Arthur, et al.
Published: (2024)
Exploring the Effects of Alignment on Numerical Bias in Large Language Models
by: Sato, Ayako, et al.
Published: (2026)
Exploring the Limitations of Large Language Models in Compositional Relation Reasoning
by: Zhao, Jinman, et al.
Published: (2024)
ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models
by: Chen, Haibin, et al.
Published: (2025)
Large Language Models for Healthcare Text Classification: A Systematic Review
by: Sakai, Hajar, et al.
Published: (2025)
A Comprehensive Evaluation of Large Language Models on Benchmark Biomedical Text Processing Tasks
by: Jahan, Israt, et al.
Published: (2023)
NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models
by: Han, Han, et al.
Published: (2024)
Assessing Large Language Models for Medical QA: Zero-Shot and LLM-as-a-Judge Evaluation
by: Adib, Shefayat E Shams, et al.
Published: (2026)
Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models
by: Li, Jinzhe, et al.
Published: (2025)