Bibliographic Details
Main authors: Hettiarachchi, Hansi; Dridi, Amna; Gaber, Mohamed Medhat; Parsafard, Pouyan; Bocaneala, Nicoleta; Breitenfelder, Katja; Costa, Gonçal; Hedblom, Maria; Juganaru-Mathieu, Mihaela; Mecharnia, Thamer; Park, Sumee; Tan, He; Tawil, Abdel-Rahman H.; Vakaj, Edlira
Format: Preprint
Published: 2024
Subjects: Information Retrieval
Online access: https://arxiv.org/abs/2403.02231
author Hettiarachchi, Hansi
Dridi, Amna
Gaber, Mohamed Medhat
Parsafard, Pouyan
Bocaneala, Nicoleta
Breitenfelder, Katja
Costa, Gonçal
Hedblom, Maria
Juganaru-Mathieu, Mihaela
Mecharnia, Thamer
Park, Sumee
Tan, He
Tawil, Abdel-Rahman H.
Vakaj, Edlira
contents Automatic Compliance Checking (ACC) within the Architecture, Engineering, and Construction (AEC) sector necessitates automating the interpretation of building regulations to achieve its full potential. Converting textual rules into machine-readable formats is challenging due to the complexities of natural language and the scarcity of resources for advanced Machine Learning (ML). Addressing these challenges, we introduce CODE-ACCORD, a dataset of 862 sentences from the building regulations of England and Finland. Only the self-contained sentences, which express complete rules without needing additional context, were considered as they are essential for ACC. Each sentence was manually annotated with entities and relations by a team of 12 annotators to facilitate machine-readable rule generation, followed by careful curation to ensure accuracy. The final dataset comprises 4,297 entities and 4,329 relations across various categories, serving as a robust ground truth. CODE-ACCORD supports a range of ML and Natural Language Processing (NLP) tasks, including text classification, entity recognition, and relation extraction. It enables applying recent trends, such as deep neural networks and large language models, to ACC.
format Preprint
id arxiv_https___arxiv_org_abs_2403_02231
institution arXiv
publishDate 2024
record_format arxiv
title CODE-ACCORD: A Corpus of building regulatory data for rule generation towards automatic compliance checking
topic Information Retrieval
url https://arxiv.org/abs/2403.02231