Bibliographic Details
Main Authors: Ahmad, Niaz, Lee, Youngmoon, Wang, Guanghui
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2504.19032
author Ahmad, Niaz
Lee, Youngmoon
Wang, Guanghui
author_facet Ahmad, Niaz
Lee, Youngmoon
Wang, Guanghui
contents We introduce VISUALCENT, a unified human pose and instance segmentation framework that addresses generalizability and scalability limitations in multi-person visual human analysis. VISUALCENT follows a centroid-based, bottom-up keypoint detection paradigm and uses a Keypoint Heatmap incorporating Disk Representation and KeyCentroid to identify the optimal keypoint coordinates. For the unified segmentation task, an explicit keypoint is defined as a dynamic centroid, called MaskCentroid, that swiftly clusters pixels to a specific human instance during rapid changes in body movement or in significantly occluded environments. Experimental results on the COCO and OCHuman datasets demonstrate VISUALCENT's accuracy and real-time performance advantages, outperforming existing methods in mAP scores and frames per second. The implementation is available on the project page.
format Preprint
id arxiv_https___arxiv_org_abs_2504_19032
institution arXiv
publishDate 2025
record_format arxiv
title VISUALCENT: Visual Human Analysis using Dynamic Centroid Representation
topic Computer Vision and Pattern Recognition
Artificial Intelligence
url https://arxiv.org/abs/2504.19032