| Main Authors: | Aritra Sarkar, shaheenaliii, Varshini S |
|---|---|
| Format: | Digital resource |
| Language: | |
| Published: | Zenodo, 2026 |
| Online Access: | https://doi.org/10.5281/zenodo.19642813 |
| _version_ | 1866901645531545600 |
|---|---|
| author | Aritra Sarkar, shaheenaliii, Varshini S |
| author_facet | Aritra Sarkar, shaheenaliii, Varshini S |
| contents | <h2>What's New</h2> <h3>Config-Driven Pipeline</h3> <ul> <li>Centralized <code>config.yaml</code> replaces all hardcoded constants</li> <li>New <code>config_loader.py</code> for dynamic pipeline configuration</li> <li>YAML-based schema validation (<code>datasets.yaml</code>) for data integrity</li> </ul> <h3>Reproducibility &amp; Publication</h3> <ul> <li>Added <code>REPRODUCIBILITY.md</code> with full reproduction steps</li> <li>Bundled SoftwareX paper (<code>paper.tex</code>, <code>paper.bib</code>)</li> <li>Added TikZ architecture diagram (<code>architecture.tex</code>)</li> </ul> <h3>CI &amp; Infrastructure Fixes</h3> <ul> <li>Fixed flake8 F824: removed unused <code>global</code> declarations in <code>state_mapping.py</code></li> <li>Fixed Docker build: switched from <code>openjdk-21</code> to <code>openjdk-17</code> (Debian Bookworm compatibility)</li> <li>Fixed test failures: added LADAKH to canonical states and corrected merged UT mappings</li> <li>All 84 tests pass on Python 3.9, 3.10, and 3.11</li> </ul> <h3>State Mapping</h3> <ul> <li>YAML-driven state name overrides (<code>state_config.yaml</code>)</li> <li>Added Ladakh as a canonical state/UT</li> <li>Aliases for the merged UT of Dadra &amp; Nagar Haveli and Daman &amp; Diu are now handled correctly</li> </ul> |
| format | Digital resource |
| id | zenodo_https___doi_org_10_5281_zenodo_19642813 |
| institution | Zenodo |
| language | |
| publishDate | 2026 |
| publisher | Zenodo |
| record_format | zenodo |
| title | aritra0309/hadoop-crime-project: Config-Driven Pipeline and Reproducibility Package |
| url | https://doi.org/10.5281/zenodo.19642813 |
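
The state-mapping changes listed in the release notes (Ladakh as a canonical state/UT, aliases for the merged Dadra & Nagar Haveli and Daman & Diu) can be illustrated with a minimal sketch. This is not the project's `state_mapping.py`: the names `CANONICAL_STATES`, `ALIAS_OVERRIDES`, `normalize`, and `canonical_state` are hypothetical, and the alias table is written inline as a dict, whereas the project presumably loads its overrides from `state_config.yaml`.

```python
import re

# Hypothetical canonical list; the project's real list is not shown in this record.
CANONICAL_STATES = {
    "LADAKH",
    "JAMMU AND KASHMIR",
    "DADRA AND NAGAR HAVELI AND DAMAN AND DIU",
}

# Hypothetical alias overrides, standing in for entries that might live in
# state_config.yaml. Both pre-merger names map to the merged UT.
ALIAS_OVERRIDES = {
    "DADRA AND NAGAR HAVELI": "DADRA AND NAGAR HAVELI AND DAMAN AND DIU",
    "DAMAN AND DIU": "DADRA AND NAGAR HAVELI AND DAMAN AND DIU",
}


def normalize(name: str) -> str:
    """Upper-case, expand '&', drop punctuation, and collapse whitespace."""
    cleaned = name.strip().upper().replace("&", " AND ")
    cleaned = re.sub(r"[^A-Z ]", " ", cleaned)
    return re.sub(r"\s+", " ", cleaned).strip()


def canonical_state(name: str) -> str:
    """Resolve a raw state/UT name to its canonical form, or raise ValueError."""
    key = normalize(name)
    key = ALIAS_OVERRIDES.get(key, key)
    if key not in CANONICAL_STATES:
        raise ValueError(f"Unknown state/UT: {name!r}")
    return key
```

Keeping the aliases in data rather than in code is what lets a YAML file override them without touching the mapping logic; failing loudly on unknown names is one possible design choice, not necessarily the project's.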