Bibliographic Details
Main Author: Meng, Wei
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2507.21100
_version_ 1866915416322867200
author Meng, Wei
author_facet Meng, Wei
contents This paper introduces TACTIC-GRAPHS, a system that combines spectral graph theory and multimodal graph neural reasoning for semantic understanding and threat detection in tactical video under high noise and weak structure. The framework incorporates spectral embedding, temporal causal edge modeling, and discriminative path inference across heterogeneous modalities. A semantic-aware keyframe extraction method fuses visual, acoustic, and action cues to construct temporal graphs. Using graph attention and Laplacian spectral mapping, the model performs cross-modal weighting and causal signal analysis. Experiments on TACTIC-AVS and TACTIC-Voice datasets show 89.3 percent accuracy in temporal alignment and over 85 percent recognition of complete threat chains, with node latency within plus-minus 150 milliseconds. The approach enhances structural interpretability and supports applications in surveillance, defense, and intelligent security systems.
format Preprint
id arxiv_https___arxiv_org_abs_2507_21100
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle A Tactical Behaviour Recognition Framework Based on Causal Multimodal Reasoning: A Study on Covert Audio-Video Analysis Combining GAN Structure Enhancement and Phonetic Accent Modelling
Meng, Wei
Computers and Society
Artificial Intelligence
Computer Vision and Pattern Recognition
05C82, 68T07, 68T05, 62H30
I.2.10; I.4.8; H.5.1; H.2.8
title A Tactical Behaviour Recognition Framework Based on Causal Multimodal Reasoning: A Study on Covert Audio-Video Analysis Combining GAN Structure Enhancement and Phonetic Accent Modelling
topic Computers and Society
Artificial Intelligence
Computer Vision and Pattern Recognition
05C82, 68T07, 68T05, 62H30
I.2.10; I.4.8; H.5.1; H.2.8
url https://arxiv.org/abs/2507.21100