| Main Authors: | Keiser, John, Lemire, Daniel |
|---|---|
| Format: | Preprint |
| Published: | 2020 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2010.03090 |
Similar Items
On-Demand JSON: A Better Way to Parse Documents?
by: Keiser, John, et al.
Published: (2023)
On‐demand JSON: A better way to parse documents?
by: John Keiser, et al.
Published: (2024)
Fixing ill-formed UTF-16 strings with SIMD instructions
by: Clausecker, Robert, et al.
Published: (2026)
Parsing Gigabytes of JSON per Second
by: Langdale, Geoff, et al.
Published: (2019)
Back to Bytes: Revisiting Tokenization Through UTF-8
by: Moryossef, Amit, et al.
Published: (2025)
UTF-8 Plumbing: Byte-level Tokenizers Unavoidably Enable LLMs to Generate Ill-formed UTF-8
by: Firestone, Preston, et al.
Published: (2025)
ByteCard: Enhancing ByteDance's Data Warehouse with Learned Cardinality Estimation
by: Han, Yuxing, et al.
Published: (2024)
One Join Order Does Not Fit All: Reducing Intermediate Results with Per-Split Query Plans
by: He, Yujun, et al.
Published: (2025)
ByteHouse: ByteDance's Cloud-Native Data Warehouse for Real-Time Multimodal Data Analytics
by: Han, Yuxing, et al.
Published: (2026)
Parsing Millions of DNS Records Per Second
by: Jeroen Koekkoek, et al.
Published: (2024)
A General Framework for Per-record Differential Privacy
by: Chen, Xinghe, et al.
Published: (2025)
Less Is More? When Dataset Context Hurts LLM-Generated Dataset Descriptions
by: Gan, Lisa-Yao, et al.
Published: (2026)
Privately Answering Queries on Skewed Data via Per Record Differential Privacy
by: Seeman, Jeremy, et al.
Published: (2023)
Apples in the Apple Library--How One Library Took a Byte.
by: Ertel, Monica
Published: (1983)
From RDF Graph Validation to RDF Dataset Validation with SHACL-DS
by: Dao, Davan Chiem, et al.
Published: (2025)
Less is More: Efficient Time Series Dataset Condensation via Two-fold Modal Matching--Extended Version
by: Miao, Hao, et al.
Published: (2024)
Process Faster, Pay Less: Functional Isolation for Stream Processing
by: Zapridou, Eleni, et al.
Published: (2026)
StreamShield: A Production-Proven Resiliency Solution for Apache Flink at ByteDance
by: Fang, Yong, et al.
Published: (2026)
Transcoding Unicode Characters with AVX-512 Instructions
by: Clausecker, Robert, et al.
Published: (2022)
Limitations of Validity Intervals in Data Freshness Management
by: Kang, Kyoung-Don
Published: (2024)
Automatic String Data Validation with Pattern Discovery
by: Lin, Xinwei, et al.
Published: (2024)
Implementing Decentralized Per-Partition Automatic Failover in Azure Cosmos DB
by: Rowe, Josh, et al.
Published: (2025)
Poseidon: A OneGraph Engine
by: Bebee, Brad, et al.
Published: (2025)
Conformance Checking for Less: Efficient Conformance Checking for Long Event Sequences
by: Bogdanov, Eli, et al.
Published: (2025)
Benchmarking Large Language Models for Knowledge Graph Validation
by: Shami, Farzad, et al.
Published: (2026)
Version Control System for Data with MatrixOne
by: Gou, Hongshen, et al.
Published: (2026)
Fast Algorithm for Embedded Order Dependency Validation (Extended Version)
by: Ramos, Alejandro, et al.
Published: (2023)
Automated Data Quality Validation in an End-to-End GNN Framework
by: Dong, Sijie, et al.
Published: (2025)
Faster Base64 Encoding and Decoding Using AVX2 Instructions
by: Muła, Wojciech, et al.
Published: (2017)
The Past Still Matters: A Temporally-Valid Data Discovery System
by: Esmailoghli, Mahdi, et al.
Published: (2025)
HotStuff-1: Linear Consensus with One-Phase Speculation
by: Kang, Dakai, et al.
Published: (2024)
Statistical Validation of Column Matching in the Database Schema Evolution of the Brazilian Public School Census
by: Yamanaka, Muriki G., et al.
Published: (2024)
This is Going to Sound Crazy, But What If We Used Large Language Models to Boost Automatic Database Tuning Algorithms By Leveraging Prior History? We Will Find Better Configurations More Quickly Than Retraining From Scratch!
by: Zhang, William, et al.
Published: (2025)
Validation of Modern JSON Schema: Formalization and Complexity
by: Attouche, Lyes, et al.
Published: (2023)
OneDB: A Distributed Multi-Metric Data Similarity Search System
by: Qian, Tang, et al.
Published: (2025)
One Size Does NOT Fit All: On the Importance of Physical Representations for Datalog Evaluation
by: Rassau, Nick, et al.
Published: (2026)
Moving from Books to Bytes.
by: Albanese, Andrew Richard
Published: (2001)
Validating Temporal Compliance Patterns: A Unified Approach with $MTL_f$ over various Data Models
by: Zaki, Nesma M., et al.
Published: (2024)
Bespoke OLAP: Synthesizing Workload-Specific One-size-fits-one Database Engines
by: Wehrstein, Johannes, et al.
Published: (2026)
Will My Favorite Chases Terminate if Evaluating Conjunctive Queries Does? One Does Not Simply Decide This
by: Larroque, Lucas, et al.
Published: (2026)