Saved in:
Bibliographic Details
Main Authors: Owino, Geofrey, Kasamani, Bernard Shibwabo, Abdelmoniem, Ahmed M., Wornyo, Edem
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2512.16271
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866917153715781632
author Owino, Geofrey
Kasamani, Bernard Shibwabo
Abdelmoniem, Ahmed M.
Wornyo, Edem
author_facet Owino, Geofrey
Kasamani, Bernard Shibwabo
Abdelmoniem, Ahmed M.
Wornyo, Edem
contents Accurate and interpretable classification of infant cry paralinguistics is essential for early detection of neonatal distress and clinical decision support. However, many existing deep learning methods rely on correlation-driven acoustic representations, which makes them vulnerable to noise, spurious cues, and domain shifts across recording environments. We propose DACH-TIC, a Domain-Agnostic Causal-Aware Hierarchical Audio Transformer for robust infant cry classification. The model integrates causal attention, hierarchical representation learning, multi-task supervision, and adversarial domain generalization within a unified framework. DACH-TIC employs a structured transformer backbone with local token-level and global semantic encoders, augmented by causal attention masking and controlled perturbation training to approximate counterfactual acoustic variations. A domain-adversarial objective promotes environment-invariant representations, while multi-task learning jointly optimizes cry type recognition, distress intensity estimation, and causal relevance prediction. The model is evaluated on the Baby Chillanto and Donate-a-Cry datasets, with ESC-50 environmental noise overlays for domain augmentation. Experimental results show that DACH-TIC outperforms state-of-the-art baselines, including HTS-AT and SE-ResNet Transformer, achieving improvements of 2.6 percent in accuracy and 2.2 points in macro-F1 score, alongside enhanced causal fidelity. The model generalizes effectively to unseen acoustic environments, with a domain performance gap of only 2.4 percent, demonstrating its suitability for real-world neonatal acoustic monitoring systems.
format Preprint
id arxiv_https___arxiv_org_abs_2512_16271
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Domain-Agnostic Causal-Aware Audio Transformer for Infant Cry Classification
Owino, Geofrey
Kasamani, Bernard Shibwabo
Abdelmoniem, Ahmed M.
Wornyo, Edem
Sound
Artificial Intelligence
Accurate and interpretable classification of infant cry paralinguistics is essential for early detection of neonatal distress and clinical decision support. However, many existing deep learning methods rely on correlation-driven acoustic representations, which makes them vulnerable to noise, spurious cues, and domain shifts across recording environments. We propose DACH-TIC, a Domain-Agnostic Causal-Aware Hierarchical Audio Transformer for robust infant cry classification. The model integrates causal attention, hierarchical representation learning, multi-task supervision, and adversarial domain generalization within a unified framework. DACH-TIC employs a structured transformer backbone with local token-level and global semantic encoders, augmented by causal attention masking and controlled perturbation training to approximate counterfactual acoustic variations. A domain-adversarial objective promotes environment-invariant representations, while multi-task learning jointly optimizes cry type recognition, distress intensity estimation, and causal relevance prediction. The model is evaluated on the Baby Chillanto and Donate-a-Cry datasets, with ESC-50 environmental noise overlays for domain augmentation. Experimental results show that DACH-TIC outperforms state-of-the-art baselines, including HTS-AT and SE-ResNet Transformer, achieving improvements of 2.6 percent in accuracy and 2.2 points in macro-F1 score, alongside enhanced causal fidelity. The model generalizes effectively to unseen acoustic environments, with a domain performance gap of only 2.4 percent, demonstrating its suitability for real-world neonatal acoustic monitoring systems.
title Domain-Agnostic Causal-Aware Audio Transformer for Infant Cry Classification
topic Sound
Artificial Intelligence
url https://arxiv.org/abs/2512.16271