| Main Authors: | Bommarito, Michael J, Katz, Daniel Martin, Bommarito, Jillian |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.04131 |
Similar Items
KL3M Tokenizers: A Family of Domain-Specific and Character-Level Tokenizers for Legal, Financial, and Preprocessing Applications
by: Bommarito, Michael J, et al.
Published: (2025)
The KL3M Data Project: Copyright-Clean Training Resources for Large Language Models
by: Bommarito II, Michael J, et al.
Published: (2025)
Natural Language Processing in the Legal Domain
by: Hartung, Dirk, et al.
Published: (2023)
OpenGloss: A Synthetic Encyclopedic Dictionary and Semantic Knowledge Graph
by: Bommarito II, Michael J.
Published: (2025)
Needles at Scale: LLM-Assisted Target Selection for Windows Vulnerability Research
by: Bommarito II, Michael J.
Published: (2026)
Explicating the Implicit: Argument Detection Beyond Sentence Boundaries
by: Roit, Paul, et al.
Published: (2024)
Binary-30K: A Heterogeneous Dataset for Deep Learning in Binary Analysis and Malware Detection
by: Bommarito II, Michael J.
Published: (2025)
Binary BPE: A Family of Cross-Platform Tokenizers for Binary Analysis
by: Bommarito II, Michael J.
Published: (2025)
Think in Sentences: Explicit Sentence Boundaries Enhance Language Model's Capabilities
by: Liu, Zhichen, et al.
Published: (2026)
CharED: Character-wise Ensemble Decoding for Large Language Models
by: Gu, Kevin, et al.
Published: (2024)
JUDGEBERT: Assessing Legal Meaning Preservation Between Sentences
by: Beauchemin, David, et al.
Published: (2025)
Are we prematurely predicting acute mountain sickness?
by: Julian C. Bommarito, et al.
Published: (2025)
Metabolic override: Adding neuropeptide Y to the list of vasoconstrictors attenuated by exercise
by: Julian C. Bommarito, et al.
Published: (2025)
Purging the Gray Zone: Latent-Geometric Denoising for Precise Knowledge Boundary Awareness
by: An, Hao, et al.
Published: (2026)
AWARE, Beyond Sentence Boundaries: A Contextual Transformer Framework for Identifying Cultural Capital in STEM Narratives
by: Khan, Khalid Mehtab, et al.
Published: (2025)
KBM: Delineating Knowledge Boundary for Adaptive Retrieval in Large Language Models
by: Zhang, Zhen, et al.
Published: (2024)
Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation
by: Ren, Ruiyang, et al.
Published: (2023)
A Reasoning-Focused Legal Retrieval Benchmark
by: Zheng, Lucia, et al.
Published: (2025)
CharBench: Evaluating the Role of Tokenization in Character-Level Tasks
by: Uzan, Omri, et al.
Published: (2025)
Rethinking the Reranker: Boundary-Aware Evidence Selection for Robust Retrieval-Augmented Generation
by: Sun, Jiashuo, et al.
Published: (2026)
Legal-DC: Benchmarking Retrieval-Augmented Generation for Legal Documents
by: Li, Yaocong, et al.
Published: (2026)
AlignAR: Generative Sentence Alignment for Arabic-English Parallel Corpora of Legal and Literary Texts
by: Huang, Baorong, et al.
Published: (2025)
Concise and Sufficient Sub-Sentence Citations for Retrieval-Augmented Generation
by: Chen, Guo, et al.
Published: (2025)
Better Language Model-Based Judging Reward Modeling through Scaling Comprehension Boundaries
by: Ning, Meiling, et al.
Published: (2025)
Learning the Boundary of Solvability: Aligning LLMs to Detect Unsolvable Problems
by: Peng, Dengyun, et al.
Published: (2025)
LegalSearchLM: Rethinking Legal Case Retrieval as Legal Elements Generation
by: Kim, Chaeeun, et al.
Published: (2025)
A Large-Scale Benchmark for Vietnamese Sentence Paraphrases
by: Nguyen, Sang Quang, et al.
Published: (2025)
Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference
by: Chen, Zhuo, et al.
Published: (2025)
HingeMem: Boundary Guided Long-Term Memory with Query Adaptive Retrieval for Scalable Dialogues
by: Zhong, Yijie, et al.
Published: (2026)
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
by: Wang, Zirui, et al.
Published: (2024)
Nudging the Boundaries of LLM Reasoning
by: Chen, Justin Chih-Yao, et al.
Published: (2025)
Human activities influence mule deer use of overpasses across multiple scales
by: Kaela M. Hamilton, et al.
Published: (2024)
ASVRI-Legal: Fine-Tuning LLMs with Retrieval Augmented Generation for Enhanced Legal Regulation
by: Octadion, One, et al.
Published: (2025)
Scaling Towards the Information Boundary of Instruction Sets: The Infinity Instruct Subject Technical Report
by: Du, Li, et al.
Published: (2025)
Parsing Through Boundaries in Chinese Word Segmentation
by: Chen, Yige, et al.
Published: (2025)
Expect the unexpected: Harnessing Sentence Completion for Sarcasm Detection
by: Joshi, Aditya, et al.
Published: (2017)
CLERC: A Dataset for Legal Case Retrieval and Retrieval-Augmented Analysis Generation
by: Hou, Abe Bohan, et al.
Published: (2024)
Modeling Sequential Sentence Relation to Improve Cross-lingual Dense Retrieval
by: Zhang, Shunyu, et al.
Published: (2023)
ParetoRAG: Leveraging Sentence-Context Attention for Robust and Efficient Retrieval-Augmented Generation
by: Yao, Ruobing, et al.
Published: (2025)
LRAGE: Legal Retrieval Augmented Generation Evaluation Tool
by: Park, Minhu, et al.
Published: (2025)