:: Library Catalog

Bibliographic Details
Main Authors:	Bommarito, Michael J; Katz, Daniel Martin; Bommarito, Jillian
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2504.04131

Similar Items

KL3M Tokenizers: A Family of Domain-Specific and Character-Level Tokenizers for Legal, Financial, and Preprocessing Applications
by: Bommarito, Michael J, et al.
Published: (2025)

The KL3M Data Project: Copyright-Clean Training Resources for Large Language Models
by: Bommarito II, Michael J, et al.
Published: (2025)

Natural Language Processing in the Legal Domain
by: Hartung, Dirk, et al.
Published: (2023)

OpenGloss: A Synthetic Encyclopedic Dictionary and Semantic Knowledge Graph
by: Bommarito II, Michael J.
Published: (2025)

Needles at Scale: LLM-Assisted Target Selection for Windows Vulnerability Research
by: Bommarito II, Michael J.
Published: (2026)

Explicating the Implicit: Argument Detection Beyond Sentence Boundaries
by: Roit, Paul, et al.
Published: (2024)

Binary-30K: A Heterogeneous Dataset for Deep Learning in Binary Analysis and Malware Detection
by: Bommarito II, Michael J.
Published: (2025)

Binary BPE: A Family of Cross-Platform Tokenizers for Binary Analysis
by: Bommarito II, Michael J.
Published: (2025)

Think in Sentences: Explicit Sentence Boundaries Enhance Language Model's Capabilities
by: Liu, Zhichen, et al.
Published: (2026)

CharED: Character-wise Ensemble Decoding for Large Language Models
by: Gu, Kevin, et al.
Published: (2024)

JUDGEBERT: Assessing Legal Meaning Preservation Between Sentences
by: Beauchemin, David, et al.
Published: (2025)

Are we prematurely predicting acute mountain sickness?
by: Julian C. Bommarito, et al.
Published: (2025)

Metabolic override: Adding neuropeptide Y to the list of vasoconstrictors attenuated by exercise
by: Julian C. Bommarito, et al.
Published: (2025)

Purging the Gray Zone: Latent-Geometric Denoising for Precise Knowledge Boundary Awareness
by: An, Hao, et al.
Published: (2026)

AWARE, Beyond Sentence Boundaries: A Contextual Transformer Framework for Identifying Cultural Capital in STEM Narratives
by: Khan, Khalid Mehtab, et al.
Published: (2025)

KBM: Delineating Knowledge Boundary for Adaptive Retrieval in Large Language Models
by: Zhang, Zhen, et al.
Published: (2024)

Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation
by: Ren, Ruiyang, et al.
Published: (2023)

A Reasoning-Focused Legal Retrieval Benchmark
by: Zheng, Lucia, et al.
Published: (2025)

CharBench: Evaluating the Role of Tokenization in Character-Level Tasks
by: Uzan, Omri, et al.
Published: (2025)

Rethinking the Reranker: Boundary-Aware Evidence Selection for Robust Retrieval-Augmented Generation
by: Sun, Jiashuo, et al.
Published: (2026)

Legal-DC: Benchmarking Retrieval-Augmented Generation for Legal Documents
by: Li, Yaocong, et al.
Published: (2026)

AlignAR: Generative Sentence Alignment for Arabic-English Parallel Corpora of Legal and Literary Texts
by: Huang, Baorong, et al.
Published: (2025)

Concise and Sufficient Sub-Sentence Citations for Retrieval-Augmented Generation
by: Chen, Guo, et al.
Published: (2025)

Better Language Model-Based Judging Reward Modeling through Scaling Comprehension Boundaries
by: Ning, Meiling, et al.
Published: (2025)

Learning the Boundary of Solvability: Aligning LLMs to Detect Unsolvable Problems
by: Peng, Dengyun, et al.
Published: (2025)

LegalSearchLM: Rethinking Legal Case Retrieval as Legal Elements Generation
by: Kim, Chaeeun, et al.
Published: (2025)

A Large-Scale Benchmark for Vietnamese Sentence Paraphrases
by: Nguyen, Sang Quang, et al.
Published: (2025)

Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference
by: Chen, Zhuo, et al.
Published: (2025)

HingeMem: Boundary Guided Long-Term Memory with Query Adaptive Retrieval for Scalable Dialogues
by: Zhong, Yijie, et al.
Published: (2026)

CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
by: Wang, Zirui, et al.
Published: (2024)

Nudging the Boundaries of LLM Reasoning
by: Chen, Justin Chih-Yao, et al.
Published: (2025)

Human activities influence mule deer use of overpasses across multiple scales
by: Kaela M. Hamilton, et al.
Published: (2024)

ASVRI-Legal: Fine-Tuning LLMs with Retrieval Augmented Generation for Enhanced Legal Regulation
by: Octadion, One, et al.
Published: (2025)

Scaling Towards the Information Boundary of Instruction Sets: The Infinity Instruct Subject Technical Report
by: Du, Li, et al.
Published: (2025)

Parsing Through Boundaries in Chinese Word Segmentation
by: Chen, Yige, et al.
Published: (2025)

Expect the unexpected: Harnessing Sentence Completion for Sarcasm Detection
by: Joshi, Aditya, et al.
Published: (2017)

CLERC: A Dataset for Legal Case Retrieval and Retrieval-Augmented Analysis Generation
by: Hou, Abe Bohan, et al.
Published: (2024)

Modeling Sequential Sentence Relation to Improve Cross-lingual Dense Retrieval
by: Zhang, Shunyu, et al.
Published: (2023)

ParetoRAG: Leveraging Sentence-Context Attention for Robust and Efficient Retrieval-Augmented Generation
by: Yao, Ruobing, et al.
Published: (2025)

LRAGE: Legal Retrieval Augmented Generation Evaluation Tool
by: Park, Minhu, et al.
Published: (2025)