Staff View: Talk is Not Always Cheap: Promoting Wireless Sensing Models with Text Prompts :: Library Catalog

Bibliographic Details
Main Authors:	Yang, Zhenkui; Huang, Zeyi; Wang, Ge; Ding, Han; Han, Tony Xiao; Wang, Fei
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2504.14621

_version_	1866917994155737088
author	Yang, Zhenkui; Huang, Zeyi; Wang, Ge; Ding, Han; Han, Tony Xiao; Wang, Fei
author_facet	Yang, Zhenkui; Huang, Zeyi; Wang, Ge; Ding, Han; Han, Tony Xiao; Wang, Fei
contents	Wireless signal-based human sensing technologies, such as WiFi, millimeter-wave (mmWave) radar, and Radio Frequency Identification (RFID), enable the detection and interpretation of human presence, posture, and activities, thereby providing critical support for applications in public security, healthcare, and smart environments. These technologies exhibit notable advantages due to their non-contact operation and environmental adaptability; however, existing systems often fail to leverage the textual information inherent in datasets. To address this, we propose a text-enhanced wireless sensing framework, WiTalk, that integrates semantic knowledge through three hierarchical prompt strategies (label-only, brief description, and detailed action description) without requiring architectural modifications or incurring additional data costs. We validate this framework on three public benchmark datasets: XRF55 for human action recognition (HAR), and WiFiTAL and XRFV2 for WiFi temporal action localization (TAL). Experimental results demonstrate significant performance improvements: on XRF55, accuracy for WiFi, RFID, and mmWave increases by 3.9%, 2.59%, and 0.46%, respectively; on WiFiTAL, the average performance of WiFiTAD improves by 4.98%; and on XRFV2, the mean average precision gains across various methods range from 4.02% to 13.68%. Our code is available at https://github.com/yangzhenkui/WiTalk.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_14621
institution	arXiv
publishDate	2025
record_format	arxiv
title	Talk is Not Always Cheap: Promoting Wireless Sensing Models with Text Prompts
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2504.14621