Bibliographic Details
Main Author: Anonymous, Author
Format: Digital resource
Language:
Published: Zenodo 2026
Online Access: https://doi.org/10.5281/zenodo.20072522
Table of Contents:
  • <p>This zip archive contains the 12 real-life and 8 synthetic datasets used in the experiments.</p> <p>## Explanation About Datasets</p> <p>Each dataset comprises three files:</p> <p>- schema (.json)<br>- positive instances (.jsonl)<br>- negative instances (.jsonl)</p> <p>The positive instances are<br>- retrieved from public websites or APIs (for real-life datasets)<br>- generated using the ground truth schemas (for synthetic datasets)</p> <p>The schemas are<br>- retrieved from public websites or manually translated into `.json` format from the official documentation (for real-life datasets)<br>- retrieved from the [JSON Schema Store](https://www.schemastore.org/json/) (for synthetic datasets)</p> <p>The negative instances are<br>- generated using the ground truth schemas</p> <p>## Negative Instances Generation</p> <p>You can generate the negative instances with the following commands.</p> <p>This may take a few hours, so we recommend running the process inside `tmux`.<br>```bash<br>cd SyntheticDatasetGen<br>./generateAllNegativeSamples.sh<br>```</p>
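The three-file layout described above (one `.json` schema plus two `.jsonl` instance files) can be read with the Python standard library alone. The following is a minimal sketch, not part of the archive: the helper names `read_jsonl` and `load_dataset` are hypothetical, and the file paths passed in are whatever names the archive actually uses, which this record does not list.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return one parsed JSON value per non-empty line of a .jsonl file."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def load_dataset(schema_path, positive_path, negative_path):
    """Load the schema and both instance files of a single dataset.

    The three paths are supplied by the caller; no naming scheme is assumed.
    """
    with Path(schema_path).open(encoding="utf-8") as handle:
        schema = json.load(handle)
    positives = read_jsonl(positive_path)
    negatives = read_jsonl(negative_path)
    return schema, positives, negatives
```

A call such as `load_dataset("schema.json", "positive.jsonl", "negative.jsonl")` (placeholder file names) returns the schema as a Python object and the two instance sets as lists, ready to be passed to a JSON Schema validator of your choice.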